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ABSTRACT 

This  thesis  studies  the  estimation  from  interarrival  and 
service  time  data  of  the  mean  number  of  customers  in  service 
at  time  t  for  an  M/G/o  queue.  Two  situations  are  considered. 
In  one  the  parametric  form  of  the  service  time  distribution 
is  known.  In  the  special  case  in  which  the  service  time 
distribution  is  exponential  the  approximate  bias  and  vari- 
ance of  the  estimate  are  derived  and  simulation  is  used  to 
study  an  approximate  normal  confidence  interval  procedure. 
Simulation  is  also  used  to  illustrate  that  assuming  a  wrong 
parametric  model  can  lead  to  misleading  results.  In  the 
other  situation,  the  parametric  form  of  the  service  time 
distribution  is  unknown  and  the  empirical  distribution  of 
the  service  times  is  used  in  the  estimate  of  the  mean  number 
of  customers  in  service.  In  the  case  in  which  the  customer 
arrival  rate  is  known  the  distribution  of  the  estimate  is 
derived  and  an  approximate  normal  confidence  interval  proce- 
dure is  suggested.  The  use  of  the  bootstrap  and  jackknife 
procedure  to  estimate  variability  and  construct  confidence 
intervals  for  the  estimate  is  also  studied  both  analytically 
and  by  simulation. 
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I.  INTRODUCTION 

A.   DESCRIPTION  OF  THE  PROBLEM 

The  application  of  probability  theory  to  a  wide  variety 
of  congestion  problems  has  been  described  in  many  papers  and 
books  [ Ref s.  1,2,3].  Results  of  queueing  theory  are 
presented  in  terms  of  component  distribution  functions  and 
stochastic  processes  (renewal,  Poisson,  etc)  that  are  taken 
as  known;  only  rarely  are  issues  addressed  that  arise  when 
actual  data  is  to  be  used  as  a  basis  for  inference  from  the 
models;  however,  see  Cox(1965)  [Ref.  4]. 

The  concern  of  this  thesis  is  inference  problems  for  a 
particularly  simple  queueing  model,  the  M/G/co  queue.  In 
this  model,  customers  arrive  according  to  a  Poisson  process 
with  rate  X  and  there  are  an  unlimited  number  of 
independent  servers.  Service  times  for  each  server  are 
independent,  identically  distributed  with  distribution 
function  F.  Let  X(t)  be  the  number  of  customers  being 
served  at  time  t.  It  is  well  known  that  if  there  are  no 
customers  being  served  at  time  0,  then 

P(X(t)=n}  =  £HC-.>1-  exp[-M(t)]  (1.1) 

where 

M(  t)  =  X  5otf(  s)ds 
with  F(t)=l-F(t)  [Ref.  2].   Thus  the  distribution  of  X(t)  is 
Poisson  and  is  characterized  by  its  mean  M(t). 

In  this  thesis  we  will  assume  that  the  service  time 
distribution  F(t)  is  unknown  and  must  be  estimated  from 
service  time  data  and  that  the  arrival  process  is  known  to 
be  Poisson,  except  possibly  for  its  rate  .  We  will  study 
the  estimation  of  the  mean  number  of  customers  being  served 
at  time  t,  M( t). 


B.   SCOPE  OF  THE  THESIS 

The  purpose  of  this  thesis  is  to  study  the  estimate  of 
the  mean  number  of  customers  being  served  at  time  t  for  a 
M/G/co  queue.  This  mean  completely  characterizes  the 
distribution  of  the  number  of  customers  being  served  at  time 
t.  We  will  assume  that  the  service  time  distribution  and 
possibly  the  customer  arrival  rate  are  unknown  and  must  be 
estimated  from  data. 

We  generally  divide  the  estimation  method  into  two  cases 
which  we  shall  call  "parametric  estimation"  and 
"nonparametric  estimation".  In  the  parametric  estimation 
case,  a  particular  probabilistic  model  is  specified  for  the 
service  time  distribution  and  the  parameters  of  the 
distribution  are  estimated.  The  resulting  estimate  of  the 
survivor  function  is  then  used  in  the  estimate  of  the 
expected  number  of  customers  being  served  at  time  t.  In  the 
nonparametric  estimation  method,  the  empirical  survivor 
function  is  used  in  the  estimate  of  the  expected  number  of 
customers. 

In  most  cases,  parametric  assumptions  concerning  the 
service  time  distribution  are  difficult  to  justify.  Hence 
nonparametric  estimation  procedure  may  well  be  preferred  to 
parametric  estimation  when  actual  data  is  used.  However, 
the  nonparametric  estimates  can  be  expected  to  be  less 
efficient  than  the  parametric  ones. 

The  thesis  is  organized  as  follows.  In  Chapter  II,  the 
transient  distribution  for  the  number  of  customers  being 
served  at  time  t  for  the  M/G/to  model  is  described  and  the 
equilibrium  distribution  as  time  goes  to  infinity  is 
obtained.  In  Chapter  III,  we  study  parametric  estimates  of 
the  mean  number  of  customers  being  served  under  several 
assumptions  for  service  time  distributions.  In  the  special 
case  in  which  the  service  time  distribution  is  exponential 
the   approximate   bias   and  variance   of   the   estimate   are 
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derived  and  simulation  is  used  to  study  an  approximate 
normal  confidence  interval  procedure.  Parametric  estimates 
for  gamma,  mixed  exponential,  and  lognormal  distributions 
are  also  considered.  Simulation  is  used  to  study  the  effect 
of  assuming  a  wrong  parametric  model.  In  Chapter  IV,  a 
nonparametric  estimate  of  the  mean  number  of  customers  being 
served  is  described.  This  estimate  is  based  on  the 
empirical  distribution  of  the  service  times.  In  the  case  in 
which  the  customer  arrival  rate  is  assumed  known  the 
distribution  of  the  nonparametric  estimate  is  derived  and  an 
asymptotic  normal  confidence  interval  procedure  is 
suggested.  The  jackknife  and  Bootstrap  methods  for 
obtaining  approximate  confidence  intervals  are  also 
described.  The  different  estimators  are  compared  by 
simulation.  Chapter  V  describes  the  simulation  and  gives 
the  results. 

In  summary,   this  thesis  studies  the  use  of  estimates  of 
the  service   time  distribution   to  obtain   estimates  of   the 
mean  number   of  customers   being  served   for  a  M/G/Co  queue. 
Both  parametric   and  nonparametric  estimates   are  considered 
and  compared  by  simulation. 
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II.  M/G/oo  QUEUE  MODEL 

The  M/G/oo  queueing  model  is  specified  by  the  following 
assumptions.  There  are  infinite  number  of  servers. 
Customers  arrive  for  service  according  to  a  Poisson  process 
with  rate  \  .  Service  times  are  nonnegative  independent 
identically  distributed  random  variables  with  distribution 
function  F.  When  a  customer  arrives,  he  immediately  starts 
service. 

Let  X(t)  represent  the  number  of  istomers  in  service  at 
time  t.  It  is  well  known  that  if  -here  are  no  customers 
being  served  at  time  0,  then 

P{X(t)=k}  =  -"£-,---  exp[-Ap(t)]  (2.1) 

where  p(t)=  j  [1-F(s)]ds:  that  is,  X(t)  has  a  Poisson 
distribution  with  mean  \p(t)  [ Ref .  2],  Taking  the  limit  as 
t  -too  in  equation  2.1,  we  obtain  the  equilibrium 
distribution 

lim  P[X(t)=k}  =  L^JzE^^expt-XHl-FCxJdx]     (2.2) 


t-*oc 


kf 


Thus,  the  limiting  distribution  of  X(t)  as  t-»co  is  also 
Poisson  with  mean  /\  m,  where  m=  J  F(x)dx  is  the  mean  service 
time.  Therefore,  the  distribution  of  the  number  of 
customers  being  served  at  time  t  is  Poisson  with  mean 

M(  t)  =  \  ftF(x)dx  (2.3) 

Here,   the   distribution  of   the  number  '  of  customers   being 
served  at  time  t  is  characterized  by  value  of  its  mean  M(t). 
The  value  of  M(t)  depends  upon  the  service  time  distribution 
which  is   assumed  unknown  and   must  be  estimated   from  data. 
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This  thesis  considers   the  problem  of  estimating  M(t)   from 
service  and  interarrival  time  data. 
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III.  PARAMETRIC  ESTIMATION  METHOD 

A.   DESCRIPTION 

In  this  chapter,  it  will  be  assumed  that  the  parametric 
form  of  the  service  time  distribution  is  known.  In  this 
case  the  estimation  of  the  mean  number  of  customers  being 
served  at  time  t,  M(t),  can  be  considered  to  be  a  function 
of  the  parameter  estimates  of  the  distribution.  In 
particular,  the  estimate  of  M(t),  when  a  parametric  form  of 
the  service  time  distribution  is  assumed,  is  denoted  by 
Mp(t),  then 

Mp(t)  =*  L  F(s)ds  (3.  1) 

where   F(  t)    is   a   survivor   distribution   of   an   assumed 
parametric  form. 

In  this  chapter  the  rate  of  the  arrival  process  will  be 
assumed  to  be  unknown.  Maximum  likelihood  estimates  of  the 
mean  interarrival  times  and  the  mean  service  times  are  used 
in  the  estimate  of  Mp(t). 

Four  parametric  service  time  distributions  will  be 
considered:  the  exponential,  the  gamma,  the  mixed 
exponential,  and  the  lognormal  distribution.  In  the 
exponential  case,  moment  approximations  are  used  to  assess 
the  bias  of  the  estimate  and  to  develop  a  confidence 
interval  procedure  based  on  asymptotic  normality.  The 
performance  of  the  confidence  interval  procedure  is  assessed 
by  simulation. 

In  the  remaining  three  parametric  models,  simulation  is 
used  to  assess  the  performance  of  the  parametric  estimates. 
Another  source  of  error  in  using  a  parametric  estimator  is 
that  the  wrong  parametric  form  may  be  used.  The  effect  of 
using  the  (wrong)  exponential  model  in  these  cases  is  also 
assessed  by  simulation. 
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Each  simulation  has  300  replications;  each  replication 
consists  of  50  independent  service  times  from  the  specified 
distribution,  and  50  independent  interarrival  times  from  an 
exponential  distribution.  The  average  relative  bias  and  the 
average  relative  square  error  of  Mp(t)  are  used  to  evaluate 
the  performance  of  the  parametric  estimation  method.  All 
simulations  were  carried  out  on  an  IBM  3033  computer  at  the 
Naval  Postgraduate  School  using  the  LLRANDOMI I ,  random 
number  generating  package  [ Ref .  6]  . 

B.   EXPONENTIAL  SERVICE  TIME 

In  this  section  it  will  be  assumed  that  the  service  time 
distribution  is  exponential;  that  is,  F(t)  =  1-EXP(  -t/jj^  ), 
where  XX.  is  an  unknown  parameter  and  must  be  estimated  from 
the  observed  data.  The  maximum  likelihood  estimate  of  A4.  is 
XK=  —  ^_  x-  ,  where  x-  is  the  service  time  of  the  i  customer. 
We  will  also  assume  that  the  rate  of  the  Poisson  arrival 
process  \  is  unknown  and  must  be  estimated. 

The  interarrival  times  of  the  customers  are  denoted  by  y, 
/  Yi.  /  •  •  /  y<v\  •  Since  the  arrival  process  is  Poisson  with  rate 
\  ,  the  interarrival  times  are  mutually  independent, 
positive  random  variables  with  the  exponential  distribution 
function  having  mean  -r-  .  The  maximum  likelihood  estimate 
of  <\  is  X  =n/ x.  y;  .  For  an  exponential  service  time 
distribution,  an  estimate  of  the  mean  number  of  customers  in 
service  at  time  t  for  an  M/G/to  queue  is 


p(t)  =\-M(l-exp[-t/*U)  (3.2) 


The   estimate   is   a  nonlinear   function   of   the   estimated 

A  A 

parameters,    \  and  U  .    In  most   cases,   when  estimating  a 
function  of  the  estimated  parameters,  bias  is  created  by  the 
nonlinear    relationship    of   the    estimated   parameters. 
Approximate  formulas  for  the  bias  and  variance  of  Mp(t)  will 
now  be  derived. 
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Let  3  be  the  mean  service   time,   (3  =—  £  x^  ,  i=l,  2, .  .  ,  n, 
and  o<  be  the  mean  mterarrival  time,  o/ =  —  2  y.  ,  i=l,2,..,n. 
By  assumption,  X;  and  Y^  are  independent.   The   estimate   of 
Mp(t)   can  be  represented  by  a  function  of  the  parameters  o< 

A. 

and  8    as  follows: 

M(3(,Q)  =  -J-ji  (l-exp[-t/£]  )  (3.3) 

There   are   no  simple,    exact   formulas   for  the   mean   and 
variance  of  the  quotient  of  two  random  variables.    However, 
there  are   approximate  formulas  which  are   sometimes  useful. 
The  approximation   can  be  obtained   from  the   partial  Taylor 
series  expansions  of  M(o(,(3)  about  the  true  means,   o(  and  (3  . 
The  expansion  is 

N(2{,j»)  =  M(o(,0)  +  ™M(<*,/3)$-oQ  +  vft-M(*fa)(£-(3) 

+  -rl*^)^^^?)  +  I  -^jM(«/(3)(^-o()2 

+  X   iVM(^,/3)(0-p)2  +  Rn  (3.4) 

Since  we  assumed  that  the  arrival  process  and  the  service 
times  are  independent,  the  covariance  terms  turn  out  to  be 
zero  when  we  take  the  expectation  of  both  sides  of  equation 
3.4.   Thus,  we  get 

E[M(S<,£)]  =  M(o*,£)  +  ^  ^-rM(o<,0)Var(oO  +  -  --rM(o/ /(a  )Var(£  ) 

+  R^  (3.5) 

where  R^  converges  to  zero  at  the  rate  —  .   The  variance  of 

estimate    is 
A     * 


9 


Var[M(o</^)]    =    [  -2j-M(  ol  fp  )  ]  2Var(S< )    +    [ -^-M((X,^3  )  ]  2Var(p  ) 

+   Rn  (3-  6) 
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with  Rn  tending  to  zero  at   the  rate  -jk.     An  approximate 
bias  term,  denoted  by  (3p(t),  can  be  derived  immediately  from 
the  equation  3.5,   that  is,    (3p(  t)=E[  M(tf,  £  )-E[  M(  £(,£)]]  . 
Subtracting  (3p(t)   from  the  parametric  value  to  correct  the 
bias,  leads  to  the  bias  corrected  estimate  of  Mp(t). 

In  order  to  compare  the  two  estimates,  bias  and 
bias-corrected,  we  define  the  following  notation.  Let  0V  be 
the  fraction  of  bias  of  Mp  (t)  against  the  true  value  M(t), 
and  0a  be  the  fraction  of  square  error  of  Mp(t)  against  the 
square  of  the  true  value:  that  is, 


0  =  Jlcil:J^£!*2 

"~M~(~t> 


(3.7) 


ncto 


t- 


(3.8) 


where  M(t)  =  \\  F(  s)ds 
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Figure  3.1   Average  Relative  Bias  of  Mp(t) 
for  Exponential  with  M =2  at  t=l 
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A  simulation  experiment  was  performed  to  assess  the 
performance  of  the  estimates.  In  the  i*^  replication,  50 
exponential  interarrival  times  having  mean  1  and  50 
exponential  service  times  having  mean  2  were  simulated  and 
estimates 

Mp(t)  =\  M(  l-exp[-t/u]  )  (3.9) 

and 

Mp(t)  =  Mp(t)  -  (3p(t)  (3.10) 

where 

(3p(t)  =  i  -|rM(o(,£)Var(5o  +  ^_  sflW  <P  )Var(<8 ) 

were  computed.  The  estimated  values  of  %(  and  A  were  used 
in  the  variance  formulas.  The  simulation  was  replicated  300 
times  and  the  average  relative  biases 

e,(t,  =    '.-^i-^iis.,  ,3.11) 

1        300  <  =  i       MCk) 


ef(t)  .^--^[Marifefe-]  (3.i2) 

300  *.\      M(;t; 
and  the  average  square  error 

Ql(t)-  i-.r[-M-^--^.]a  (3.13, 

300      Jwsrl  Met} 


30o  *-\      Mct> 

were  computed. 

Results  of  the   simulation  are  presented  in   Figures  3.  1 
and  3.2.    In  Figure  3.1,   the  dotted  line  shows  the  average 
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Figure  3.2   Average  Relative   quare  Error  of  Mp(t) 
for  Exponential  with  U.  =2  at  t=l.   (  n=50  ,  r=300  ) 

relative  bias,  9,(t),    as  a  function  of  t   for  the  original 


-c 


The  solid  line  shows  6,(t)    for  the  bias 


estimate  Mp(  t) . 

corrected   estimate    Mp(t).     This    superimposed   figure 

— c 
indicates  that  0((t)   for   the  bias-corrected  estimate  (with 

solid  line)    is  almost   constant  and   is  small.     The  bias 

estimate  produces  large   negative  value  of  6v(t)    but  9\(t) 


approaches  a  limiting  value  as  t   -fOO 


Figure  3.2  shows  of 


-c 


the  average  relative  square  error  9*(t)  and  94(t)  plotted  as 
a  function  of  time.  The  dotted  line  gives  6a.(t)  and  the 
solid  line  is  0Jt).  It  appears  from  figures  that  the 
estimate  of  bias  described  in  equation  3. 5  does  correct  for 
the  bias.  However,  in  Figure  3.2  the  bias-corrected 
estimate  has  a  slightly  higher  relative  square  error  than 
the  original  estimate.  This  higher  relative  square  error 
could  be  due  to  correlation  between  the  estimate  itself  and 
the  estimate  of  its  bias. 

Simulation  was   used  to   assess  the   performance  of   the 
following   confidence   interval   procedure.      In   the   i  n 
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TABLE  I 

COVERAGE  AND  LENGTH  OF  100( l-«)%  C.I.  FOR 

THE  ORIGINAL  ESTIMATE   (  N  =  50,   R  =  300  ) 


trial 

68  % 

80  % 

90  % 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

1 

0. 2234 
(0.  0415) 

10.  67 
72.  00 
17.  33 

0. 2879 
( 0. 0535) 

4.  00 
83.  00 
13.  00 

0.  3695 
( 0. 0687) 

1.  00 

90.  00 

9.  00 

2 

0. 2237 
( 0. 0396) 

9.  33 
73.  33 
17.  33 

0. 2882 
( 0. 0510) 

4.  33 
81.  00 
14.  67 

0.  3699 
(0.  0655) 

1.  33 
86.  33 
12.  33 

3 

0. 2224 
( 0. 0410) 

11.  00 

68.  67 
20.  33 

0. 2866 
( 0. 0529) 

4.  33 
79.  00 
16.  67 

0. 3678 
( 0. 0678) 

1.  33 
87.  33 
11.  33 

4 

0.  2212 
(0.  0402) 

9.  00 
66.  67 
24.  33 

0. 2851 
( 0. 0519) 

3.  67 
76.  00 
20.  33 

0. 3659 
(0. 0666) 

1.  00 
83.  33 
15.  67 

5 

0.  2207 
( 0. 0409) 

13.  33 
62.  33 
24.  33 

0. 2844 
( 0. 0527) 

4.  67 
77.  67 
17.  67 

0. 3651 
(0.  0677) 

0.  67 
88.  67 
10.  67 

6 

0.  2269 
( 0. 0433) 

11.  33 

70.  00 
18.  67 

0.2925 
( 0. 0558) 

3.  67 
82.  67 
13.  67 

0. 3754 
(0.  0717) 

0.  33 
89.  33 
10.  33 

7 

0. 2223 
( 0. 0375) 

10.  67 
69.  00 
20.  33 

0. 2865 
( 0. 0483) 

3.  00 
82.  00 
15.  00 

0. 3677 
(0.  0620) 

0.  33 
89.  67 
10.  00 

8 

0.  2246 
(0.  0425) 

7.  67 
71.  67 
20.  67 

0. 2895 
(0.  0548) 

2.  33 
81.  33 
16.  33 

0.  3715 
( 0. 0704) 

0.  33 

87.  67 
12.  00 

9 

0. 2232 
(0.  0423) 

11.  00 
63.  67 
25.  33 

0. 2876 
(0.  0545) 

6.  00 
82.  67 
21.  33 

0. 3692 
(0.  0700) 

0.  67 
83.  67 
15.  67 

10 

0. 2270 
( 0. 0411) 

13.  67 
67.  67 
18.  67 

0. 2926 
(0.  0530) 

3.  00 

81.  00 
16.  00 

0. 3755 
(0. 0680) 

0.  33 
87.  33 
12.  33 

Average 

0. 2235 
( 0. 0410) 

10.  77 
68.  50 
20.  73 

0. 2881 
( 0. 0528) 

3.  90 
80.  63 
15.  47 

0.  3697 
( 0. 0678) 

0.  74 
87.  33 
11.  93 

replication  of  the  simulation,  50  exponential  interarrival 
times  having  mean  1,  and  50  exponential  service  times  having 
mean  2  were  generated,   and  Mp(t)   and  Mp(t)   were  computed. 

A. 

The  approximate  variance  of  Mp(t)  was  computed  for  t=l  using 
equation  3.6  with  Rn=0.  The  100(l-o()%  confidence  limits  L 
and  U  were  computed  by 
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TABLE  II 

COVERAGE  AND  LENGTH  OF  100(1-*)%  C.  I.  FOR 
THE  BIAS-CORRECTED   ESTIMATE  (  N=50,  R=300  ) 


trial 

68  % 

80  % 

90  % 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

1 

0.  2234 
(0.  0415) 

14.  67 
69.  00 
16.  33 

0. 2879 
(0.  0535) 

6.  67 
80.  67 
12.  67 

0. 3695 
(0.  0687) 

2.  33 

89.  00 
8.  67 

2 

0.  2237 
(0.  0396) 

13.  67 
69.  67 
16.  67 

0.2882 
(0.  0510) 

5.  67 
80.  33 
14.  00 

0. 3699 
(0.  0655) 

2.  67 
86.  33 
11.  00 

3 

0.  2224 
(0.  0410) 

16.  00 
66.  00 
18.  00 

0. 2866 
(0.  0529) 

7.  67 
76.  33 
16.  00 

0. 3678 
(0.  0678) 

2.  00 
87.  33 
10.  67 

4 

0.  2212 
(0.  0402) 

14.  33 
62.  33 
23.  33 

0. 2851 
(0.  0519) 

6.  00 
74.  33 
19.  67 

0. 3659 
(0.  0666) 

1.  67 
83.  00 
15.  33 

5 

0. 2207 
(0.  0409) 

17.  67 
59.  67 
22.  67 

0. 2844 
( 0. 0527) 

9.  00 
74.  67 
16.  33 

0. 3651 
(0.  0676) 

1.  67 

88.  67 

9.  67 

6 

0.  2269 
( 0. 0433) 

18.  67 
64.  00 
17.  33 

0. 2925 
(0. 0558) 

6.  33 
81.  33 
12.  33 

0.  3754 
(0. 0717) 

2.  00 
88.  00 
10.  00 

7 

0.  2223 
(0.  0375) 

14.  67 
67.  00 
18.  33 

0. 2865 
(0.  0483) 

5.  67 
80.  33 
14.  00 

0. 3677 
( 0. 0620) 

2.  00 

88.  67 

9.  33 

8 

0.  2246 
(0.  0425) 

14.  00 
67.  33 
18.  67 

0.  2895 
(0. 0548) 

4.  67 
80.  00 
15.  33 

0.  3715 
(0. 0704) 

0.  67 
88.  00 
11.  33 

9 

0.2232 
(0.  0423) 

15.  67 
60.  00 
24.  33 

0.  2876 
( 0. 0545) 

9.  00 
70.  67 
20.  33 

0. 3692 
(0.  0670) 

1.  67 
83.  33 

15.  00 

10 

0. 2270 
(0. 0411) 

18.  33 
63.  00 
18.  67 

0.  2926 
(0. 0530) 

9.  33 
75.  33 
15.  33 

0. 3755 
(0.  0680) 

1.  67 
86.  00 
12.  33 

Average 

0.  2235 
(0.  0410) 

15.  77 
64.  80 
19.  43 

0.  2881 
( 0. 0528) 

7.  00 
77.  40 
15.  60 

0. 3697 
(0.  0678) 

1.  84 
86.  83 
11.  33 

L   = 


Mp(t) 


Z'-iVVarCMCS(^)J 


(3.15) 


and 


U  =  Mp(t)    +    z^  yvfcrCMtoL^l 


(3.16) 


21 


where  z  .  «  is  the  upper  1-  ^  point  of  the  standard  normal 
distribution.  Tables  I  and  II  show  the  results  of  10 
independent  simulations  for  the  original  and  the 
bias-corrected  estimate.  Each  simulation  was  replicated  300 
times.  Tables  report  the  average  and  standard  deviation  of 
the  normal  confidence  interval  length;  the  proportion,  p  of 
the  intervals  that  covers  the  true  value;  the  proportion  of 
intervals  that  are  too  high,  (e.g.  M(t)  <  L  );  and  the 
proportion  of  intervals  that  are  too  low, (e.g.  M(t)  >  U  ). 

Since  the  simulation  replications  are  independent,  it  is 
possible  to  assess  the  uncertainty  of  p.  If  the  confidence 
interval  procedure  is  correct,  then  p  should  be  within 
approximately  ±  2  J  ^ c ' "^  o f  l-o(.  The  coverage  rate  in  the 
tables  indicate  that  the  parametric  estimates  tend  to 
underestimated.  Obviously,  the  distribution  of  Mp(t)  is 
skewed  right.  However,  the  confidence  interval  procedure 
works  well,  regardless  for  both  the  original  and  the 
bias-corrected  estimate.  Both  have  a  variable  coverage 
rate.  The  difference  of  performance  between  two  estimates 
is  not  significant. 

C.   OTHER  SERVICE  TIME  DISTRIBUTIONS 

1.   Mixed  Exponential  Service  Time 

In  this  subsection,  service  times  having  a  mixed 
exponential  parametric  form  will  be  considered.  Customers 
arrive  according  to  a  Poisson  process  with  unknown  rate 
\  which  must  be  estimated.  Customers  are  of  two  types;  with 
probability  P(  ,  a  customer's  service  time  is  exponential 
with  mean  m{  ;  with  probability  Pz ,  the  customer's  service 
time  is  exponential  with  mean  Mi..  If  T  is  the  service  time 
of  an  arbitrary  customer,  then 

P(T  >  t)  =  P,  exp[ -t/u,  ]  +  Pa.exp[ -t/MaJ  (3.17) 
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where  P,   +  P a  =  1.     In  this   case,   the   mean  number   of 
customers  being  served  at  time  t  is 

Mp(  t)  =  \  \  P,  Mt  C  i-  expc-t  AO)  ■+  PaXta  c  t  -  e*p  c-t/<o>j(  3. 18) 

We  will  assume   that  a  customer's  type  and   service  time  are 
observable. 


Figure  3.3    Average  Relative  Bias  of  Mp(t)  for 
Mixed  Exponential  with  JU,=2,  #*.=.  75,  P,  =.  2  at  t=l 

(  n=50,   r=300  ) 

A  simulation  experiment  was  done  to  assess  the 
performance  of  Mp(t).  In  the  i""  replication,  service  times 
for  50  customers  were  generated,  where  the  proportion  P,  of 
them  have  a  type  1  and  the  proportion  Fx  have  a  type  2.  A 
random  number  was  drawn  to  determine  the  type  of  customer. 
If  a  customer  was  of  type  i,  a  service  time  was  generated 
from  the  exponential  distribution  with  mean  uc  .  For  the 
simulation,  the  proportion  P,  is  assumed  known.  The  50 
independent  interarrival  times  were  also  generated.   In  this 
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Figure  3.4   Average  Relative  Square  Error  of  Mp(t)  for 
Mixed  Exponential  with«,=2,  U*=.  75,  P,  =.  2  at  t=l 

(  n=50,   r=300  ) 

case  ,  P,=0.2,  iA{  =2.  ,  #i=0. 75,  and  A=l.  The  estimate  Mp(t) 
was  computed;  the  estimate  of  Pi  is  the  proportion  of 
customers  that  were  type  i;  the  estimate  of  the  mean  service 
time  Ma  was  the  mean  service  time  of  type  i  customers;  the 
estimate  of  \    was  the  mean  interarrival  time.   The  estimate 

A 

Mg(t)  of  M(t)  assuming  that  the  service  time  distribution  is 
exponential  was  also  computed,  that  is, 


M-(t)  =\M  (  l-exp[-t/w]  ) 


(3.19) 


with  Kk  equal  to  the  mean  service  time  and  \  is  equal  to 
the  mean  interarrival  time.  The  procedure  was  replicated 
300  times.  Results  of  the  simulation  appear  in  Figures  3. 3 
and  3.4.  0»(t)  is  the  average  relative  bias  of  the  estimate 
and  0a.(t)  is  the  average  relative  square  error  as  computed 
in   equation  3.  11   and  3.  13.     The   solid  line   is  for   the 

A 

correct  parametric  model   estimate  Mp(t).    The   dotted  line 

A 

represents  the  exponential  model  estimate  Me(t). 
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In  both  figures,  we  compare  the  correct  model 
(represented  by  a  solid  line)  with  the  erroneous  exponential 
model.  As  expected,  the  exponential  model  clearly  does  not 
perform  as  well  as  the  mixed  exponential  model.  In  Figure 
3.3,  the  level  of  bias  for  the  exponential  model  is  very 
high  early  in  time,  but  is  reduced,  and  stabilizes  as  t  -*oo  . 
Notice  that  the  estimate  using  the  exponential  model  appears 
too  large.  The  average  relative  square  error  of  the 
estimates  is  shown  in  Figure  3. 4.  Although  the  results  of 
exponential  model  shows  a  slightly  higher  mean  square  error 
than  that  of  the  mixed  exponential  model,  the  results  of 
exponential  model  are  not  too  much  worse.  This  is  not 
surprising,  since  as  t  -too  the  estimate  just  depends  on  the 
mean. 

2.   Gamma  Service  Time 

In  this  subsection,  the  parametric  form  of  the 
service  time  distribution  is  gamma.  The  probability  density 
function  is 

f(t)  =  -j-  (f   tK"1  e~^t  (3.20) 

Too 

where  k  and  3  are  strictly  positive  parameters  of  the 
distribution,  and  k  is  further  assumed  to  be  an  integer.  By 
successive  integrations  by  parts,  we  get 

F(t)  =  1  -  X  e"at  C<3t>*  (3.21) 

A=0  i  / 

its  mean  and  standard  deviation  are 

Thus,  k  is  the  parameter  that  specifies  the  degree  of 
variability  of  the  service  times  relative  to  the  mean. 

Since  the  arrival   process  is  Poisson  with   rate  \   , 
the  interarrival   times  are  mutually   independent,   positive 
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random  variables  with  the  distribution  function 
G( y)=l-EXP( -  \  y) ,  where  \  must  be  estimated  from 
interarrival  data  y;  ,  i=l  to  n.  Thus,  the  Mp(t)  in  this 
case  is  obtained  by  the  successive  integrations  by  parts  of 
the  survival  function  of  the  gamma  distribution,  that  is 

"r^-MiEU-i^-TT^  (3.22, 


Figure  3.5    Average  Relative  Bias  of  Mp(t) 
for  Gamma  with   3=1,  k=2  at  t=l.   (n=50,   r=300  ) 

The  performance  of  Mp  (t)  was  evaluated  by  the 
simulation.  In  the  simulation,  the  parameter  k  of  the  gamma 
distribution  was  assumed  to  be  known,  but  the  rate  of  the 
arrival  process  is  unknown  and  is  estimated.  Two  simulation 
cases  were  run.  In  the  i^  replication  of  the  simulation,  50 
independent  service  times  were  generated,  where  the  first 
simulation  case  used  the  gamma  distribution  having  -3  =1  and 
k=2  and  the  second  case  used  the  gamma  distribution  having 
(3  =0. 5  and  k=4;  50  independent  interarrival  times  having 
(3=1  were  also  generated.   For  k=2 ,  the  estimate  is 
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Mp(t)    =  \(-7-(l-exp[-Jt]     -    t-exp[-At] 


(3.  23) 


and   for  k=4,    the   estimate   is 


M 


l    V- 


p(t)      =\{-?-(l-exp[-at]    -   exp[-/at](3t   +  p  t1  +    -v-At*} 

r  a  b  (  3.  24 


■n 


where  ^  =n/i  y.  and  (3  =kn/2  xA  were  calculated.  An  estimate 
based  on  an  erroneous  exponential  service  time  model 
parametric  estimator  was  also  calculated 


Me(t)  =  \M(l-exp[  -t/u]  ) 


(3.  25) 


with  M  =-S  x- 


Figure  3.6    Average  Relative  Bias  of  Mp(t) 
for  Gamma  with  Q=. 5,  k=4  at  t=l.   (  n=50,   r=300  ) 

The  simulation  was  replicated  300  times.  The 
average  relative  bias  6,(t)  and  the  average  relative  square 
error    9u.(t)   were   calculated.    These   results  appear   in 
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Figure  3.7    Average  Relative  Square  Error  of  M  (t) 
for  Gamma  with   3  =1,  k=2  at  t=l.   (  n=50,   r=300  ) 

Figures  3.5,  3.6,  3.7,  and  3.8.  The  dotted  lines  in  the 
figures  show  the  results  of  the  erroneous  exponential  model. 
The  solid  line  represents  the  results  of  the  gamma  model. 
The  figures  clearly  show  that  the  performance  of  the 
exponential  model  is  not  as  good  as  that  of  the  gamma  model 
initially.  However,  both  models  have  the  same  equlibrium 
state  for  t  >  15,  approximately.  Figure  3.5  shows  the 
average  relative  bias  of  the  estimate  in  the  case  of  k=2  and 
Figure  3.  6  shows  the  same  in  the  case  of  k=4.  In  both 
figures,  the  erroneous  exponential  model  has  a  high  level  of 
bias  and  the  bias  of  the  gamma  model  is  almost  constant; 
however,  both  models  have  exactly  the  same  limiting  value  as 
t  -*00  .  The  average  relative  square  error  appears  in  the 
Figures  3.  7  and  3. 8.  Figure  3. 7  represents  the  average 
relative  square  error  of  M»(t)  and  Me(t).  in  case  of  k=2 , 
and  Figure  3. 8  represents  the  same  in  the  case  of  k=4.  In 
Figure  3.7,  the  exponential  model  has  a  poorer  performance 
than  the   gamma  model,    but  the   difference  is   small.    In 
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Figure  3.8,  the  exponential  model  has  a  large  value  of  mean 
square  error  initially  and  the  level  of  mean  square  error 
associated  with  the  exponential  model  grows  as  the  value  of 
parameter  k  is  increased. 


Figure  3.8    Average  Relative  Square  Error  of  Mp(t) 
for  Gamma  with   3 =. 5,  k=4  at  t=l.   (  n=50,   r=300  ) 

3.   Lognormal  Service  Time 

In  this  subsection,  the  service  time  distribution  is 
assumed  to  be  lognormal.  Let  X  be  a  random  variable,  and 
let  a  new  random  variable  Y  be  defined  as  Y=ln  X.  If  Y  has 
a  normal  distribution,  then  X  is  said  to  have  a  lognormal 
distribution.  The  density  of  a  lognormal  distribution  is 
given  by 

\ 


f(x)    =   — 1  — eXp[ \Z(ln  x    "S   )^ 


( 3. 26) 
where      -00<   §    <  00    and      cf  >0.  Set      Y=ln     X   -  §       and  by      the 
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integration 

P{X  $  x)    =   F^(x) 

_  i_   r^*-§ 

~  UTT  J-oo  eNpC-zVacf4]   Ay 

=  C,Y  (t-^i.)  (3.27) 

where  G  (  ■  )   is  the  standard  normal  distribution  function. 
If  the  service  time  has   a  lognormal  distribution,   then  the 
estimated  mean  number  of  customers   being  served  at   time  t, 
Mp(t),  is  obtained  by  integration-by-parts 


M 


p(t)  =  \  ft[l-F(t)]  +  jtsF(s)ds}  (3.28) 


Substituting  equation  3.26   for  f(t)   and  equation   3.27  for 
F(t)  in  the  equation  3.28,  we  obtain 


A 


kt-K-,  ,  ^r^<T\r_  rU\-\-A 


Mp(t)  =A^H-^T(^^)]^  e<pC§t^3QxC^p-)5  (3-29) 

where  Gy(  )  is  the  standard  normal  distribution  function. 
The  lognormal  distribution  is  positively  skewed  and  the 
level  of  skewness  depends  upon  the  value  of  mean  and 
variance  of  the  distribution.  If  the  value  of  mean  is 
decreased  but  the  variance  is  increased,  then  the  shape  of 
distribution  tends  to  be  more  skewed  and  it  approaches  the 
shape  of  the  exponential  model. 

The  performance  of  the  parametric  estimate  was 
assessed  by  simulation.  In  each  replication  of  the 
simulation,  50  independent  lognormal  service  times  and  50 
independent  exponential  interarrival  times  were  generated. 
The  simulation  generated  two  sets  of  the  service  times.  One 
set  of  service  times  is  from  the  lognormal  distrbution  with 
§  =0.  193  and  <j~  =1  ,  and  the  other  set  is  from  the  lognormal 
distribution  with  §=0.568  and  S  =0. 5.  To  estimate  the  mean 
and  variance   from  the  data,    the  logarithm  of   the  service 
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Figure  3.9    Average  Relative  Bias  of  Mp(t)  for 
Lognormal  with   ^  =.193,  $X=1    at  t=l.   (  n=50,   r=300  ) 


Figure  3.10    Average  Relative  Bias  of  Mp(t)  for 
Lognormal  with  §  =.  568,  <f^=.  5  at  t=l.   (  n=50,   r=300  ) 


time  data,   t;.=ln  X;  ,   i  =  l  to  n  was  computed, 
variance  are  expected  by 


The  mean  and 
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Thus    the    estimate    is 


§  =  -;-  sit 


MP(t)  =^^ti:i-GrYC%Li,^  •v^fC^4\^CJUiy"<r)^    (3.30) 

where  X  is  the  estimate  of   the  arrival  rate.    An  estimate 
based  on  an  erroneous  exponential  model 


Mc(  t)  =VM(1  -  exp[  -t/A]  ) 


(3.31) 


_  v 


<n 


was  also  computed,  where  m=  —  X  x* 


Figure  3.11    Average  Relative  Square  Error  of  Mp(t) 
for  Lognormal  with  §=.193,  £X=l    at  t=l.   (  n=50,   r=300  ) 

The  simulation  was  replicated  300  times.  Figure  3.  8 
shows  the  tendency  of  9,(t),  the  average  relative  bias  of 
Mp(t),  for  the  correct  model  (shown  with  a  solid  line)  and 
the  erroneous  exponential  model  (shown  with  a  dotted  line) 
for  the  simulation  with  ,§=0.193  and  <f  =1  in  Figure  3.9,  and 
with  §=0.568  and  0=0.5  in  Figure  3.10. 
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Figure  3.12    Average  Relative  Square  Error  of  Mp(t) 
for  Lognormal  with  §=.568,  <j^=.  5  at  t=l.   (  n=50,   r=300  ) 

In  both  figures,  the  average  relative  bias  of  the 
exponential  model  is  also  large  initially.  As  expected,  the 
average  relative  bias  of  the  exponential  model  in  Figure 
3.  10  is  larger  than  the  results  in  Figure  3.  9.  As  t  *oo  , 
the  exponential  model  shows  better  performance  and  has  a 
limiting  value.  Figures  3.  11  and  3.  12  show  the  tendency  of 
OJt),  the  average  relative  square  error,  for  the  correct 
model  ( shown  with  a  solid)  and  the  erroneous  exponential 
model  (shown  with  a  dotted  line).  For  Figure  3.11  the 
parameters,  §=0.193  and  cf  =1,  are  used  to  generate  data. 
And  for  Figure  3.12  the  parameters,  f  =0.568  and  <f=0.5,  are 
used.  The  value  of  average  relative  square  error  of  the 
exponential  model  in  Figure  3.  12  is  also  higher  than  the 
results  in  Figure  3.  11. 

D.   SUMMARY 

The  general   conclusions  of   this  chapter   are  that   the 
parametric   estimation  method   is  a   highly   efficient   for 
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obtaining  estimates  of  M(t)  whenever  the  correct  assumption 
for  the  model  is  given.  The  structure  of  the  estimate  of 
M(t)  is  clearly  biased  since  it  is  a  nonlinear  function  of 
the  estimated  parameters  for  the  service  time  distribution 
and  the  customer  arrival  rate.  However,  for  the  service 
time  distributions  considered  the  indications  are  that  the 
bias  is  small.  Hence  the  parametric  estimation  method 
performs  very  well,  whether  or  not  the  estimate  is  corrected 
for  bias,  when  the  correct  parametric  form  is  used.  However 
the  performance  of  the  parametric  estimation  is  very  poor 
when  the  wrong  parametric  model  is  used.  For  instance,  the 
erroneous  exponential  model  often  has  a  high  level  of  bias 
and  mean-squared  error.  Notice  that  the  exponential  model 
converges  to  the  same  limiting  value  as  the  correct  model  as 
t  ->oo  in  all  the  cases  considered.  This  is  because  as  t  -*oo 
all  estimates  use  the  mean  service  time  to  estimate  the 
integral  of  the  servivor  functions. 
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IV.  NONPARAMETRIC  ESTIMATION 

A.   DESCRIPTION 

Nonparametric  methods  are  statistical  techniques  which 
are  applicable  regardless  of  the  form  of  the  distribution 
function  that  the  measurement  comes  from.  In  this  chapter, 
these  techniques  will,  for  the  most  part,  be  based  on  the 
order  statistics. 

Let  x^x-l  ,...,x^  denote  a  random  sample  from  a  CDF  F, 
and  let  su  ,  sLa>  , .  .  .  ,  s^  denote  that  corresponding  order 
statistics.   Then  the  sample  CDF  is  defined  by 

F^(t)  =  —(number  of  s.  less  than  or  equal  to  t) 

»»» 
For  fixed   time  t,    Fn(t)    is  a   statistic  since   it  is   a 

A 

function  of  the  sample.  In  fact,  for  fixed  time  t,  Fn(t) 
has  the  same  distribution  as  that  of  the  sample  mean  of  a 
Bernoulli  random  variable.    We  know  by  the   central  limit 

A 

theorem  that  Fh(t)  is  a  asymptotically  normally  distributed 
with  mean  F(t)  and  variance  (  -^  )F(  t)  [  1-F(  t )  ]  . 

Recall  that  (in  chapter  II)  the  mean  number  of  customers 
being  served  at  time  t,  M(t),  is  a  function  of  the  arrival 
rate  \  and  the  survivor  function  of  the  service  times. 
Hence  a  nonparametric  estimator,  denoted  by  Mw(t),  can  be 
represented  by  the  estimated  values  of  \  and  F(t).  The 
estimated  survivor  function  of  service  time  is 


A 

Fn(t)    =|     1  if   0   *   t   <    sto 

%"  if    lo*    t   <    Vo  for   i  =  l<2,..,n-l 


0  if   t    >,  s 


C^> 
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Now,   using   the  fact   that  MN(t)=  \  \a   F(s)ds,   we   obtain  a 
nonparametric  esimate 


MH(t)  =,  \t 


A  [  --2Ls  +  ----  t] 


^ 


V  AlV£*ci> 


if  0  (  t  <  s 
if  s  ^  t  <  s 

if  t  £  s 


(4.1) 


where   \  is  the   estimated  arrival   rate.     Note   hat   the 

nonparmetric  estimate  has   a  limiting  value  as   t  -?oo  ,   that 

is,  lim  Mo(t)=\m  where  m  is  the  mean  service  time, 
•t-fao 
In   this   chapter,    we   will   consider   two   different 

situations.    In  one  case,   we   will  assume  that  the  arrival 

rate   is  known  but   the  distribution   of   service  times   is 

unknown  and  must  be  estimated.    Based  on   this   assumption, 

Mls)(t)   is  expressed  simply  in   terms  of  the  order  statistics 

of  the  service  times  as  follows 


K 


~k^s     *  "3\-  t] 


(4.2) 


when   s^<t<  s^0. 

distribution  of  M^(t)   in  this  case. 


In  the   Appendix   A,    we  derive   the 

Its  mean  and  variance 


are 


E[M„(t)]    =\5t>F(s)ds  (4.3) 

Var[MKj(t)]    =    ~{   i*sF(ds)    -    [  £  sF(  ds)  ]  2    +    t2F(t)F(t) 

-    2tF(t)[    J0tsF(ds)]  }  (4.4) 


Thus,  M,j(t)  is  an  unbiased  estimate  of  M^i(t).    Further,   as 

A 

the  sample  size  n  is  increased,  M^(t)  is  asymptotically 
normal.  Thus  an  approximate  normal  100(l-o<)%  confidence 
interval  for  Mw(t)  is  given  by 
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M 


g(t)   ±   z  ,.^  I  Var[M  (t)]  (4.5) 


where  z  t_*  is  the  upper  1-  ^  point  of  standard  normal 
distribution  and  Var[M^(t)]  is  given  in  Appendix  A.  In  this 
chapter,  we  will  also  study  the  jackknife  and  the  bootstrap 
procedures  for  obtaining  confidence  intervals  for  M^(t)  in 
the  case  in  which  \  is  known.  In  the  second  case,  we  will 
assume  that  the  arrival  rate  X  is  also  unknown  and  must  be 
estimated.  Then,  a  nonparametric  estimate  M^(t)  is  the 
product  of  two  estimates, 
^         a 

*      *   \  K     nv-K  , 

JWt)  =\[»ls   +  -™t]  (4.6) 

*    a       * 

where  \=n/-=£y.  and  K  is  the  number  of  service  times  that 
are  less  than  or  equal  to  t.  There  are  no  exact  functional 
forms  for  the  mean  and  variance  of  M,4  (t)  in  this  case. 
However,  the  jackknife  and  the  bootstrap  methods  can  be  used 
to  obtain  confidence  intervals.  This  will  be  described 
below. 

B.   JACKKNIFE  ESTIMATION  METHOD 

In  this  section,  we  will  study  the  jackknife  procedure 
for  obtaining  a  confidence  interval  for  M ^  (t).  The 
jackknife  was  first  introduced  by  Quenouille  (1949)  for  the 
purpose  of  reducing  the  estimate  bias,  and  the  procedure  was 
later  utilized  by  Tukey  (1958),  to  develop  a  general  method 
for  obtaining  approximate  confidence  intervals  [ Ref s.  7,8]. 

The  basic  idea  of  the  jackknife  estimation  method  is  to 
assess  the  effect  of  each  of  the  groups  into  which  the  data 
have  been  divided,  not  by  the  results  for  that  group  alone, 
but  rather  through  the  effect  upon  the  body  of  data  that 
results  from  omitting  that  group.  The  two  bases  of  the 
jackknife  are  that  we  make  the  desired  calculation  for  all 
the  data,  and  then,  after  dividing  the  data  into  groups,  we 
make   the  calculations   for   each   of  the   slightly   reduced 
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bodies  of  data  obtained  by  leaving  out  just  one  of  the 
groups.  A  special  case  of  Jackknife  estimation  is  called 
the  "complete   jackknife  estimation",    where  the   number  of 

XL 

subgroups  is  n  (the  size  of  sample);  the  i  subgroup  is 
obtained  by  deleting  the  i"^  observation;  thus  the  size  of 
each  subgroup  is  n-1  [ Ref .  9].  Attention  will  be  restricted 
to  complete  jackknife  estimation  in  this  study. 

Let  M^.^t)   be   the  estimated   mean  number   of  customers 
being  served   at  time  t   on  the   portion  of  the   sample  that 
omits  the   i  ^  sample.     Let  M    (t)   be   the  corresponding 
estimator   for   the    entire   sample   and  define    the   i"^ 
Pseudo-value  by 

M.(t)  =  nMal,(t)  -  (n-l)M^(t)  (4.7) 

<\  A  J. 

The  jackknife  estimate  M3  (t)  and  an  estimate  S  of  its 
variance  are  given  by 

M  (t)  =  --  51  M4(t)  (4.8) 


S*  = Z.  [Mi(t)  -  M3(t)]2  (4.9) 

Tukey  (1958)  proposes  that  the  n  estimated  pseudo  values  be 
treated  as  approximately  independent  and  identically 
distributed  random  variables  [Ref.  9].   Hence,  the  statistic 

A  A 


Jv\   C  M3cb  -  rt^tt) ) 


(4.10) 
has   an   approximate   t-distribution  with  n-1   degrees   of 
freedom,    which    leads   to    the   approximate    100(1- <^  )% 
confidence  interval 

M3(t)   ±   t  i_^  S,  (4.11) 
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where  t  v-*  is  the  upper  1-  ^_  critical  point  of  the 
t-distribution  with  n-1  degrees  of  freedom.  The  confidence 
interval  given  by  equation  4.  11  is  a  function  of  the 
estimated  variance.  In  the  remainder  of  this  section,  we 
will  describe  several  methods  of  implementing  the  confidence 
interval  procedure.  We  will  also  obtain  an  analytic 
expression  for  the  jackknife  estimate  and  its  variance 
estimate  for  the  case  in  which  the  arrival  rate  \  is  known. 
1.   Jackknife  Estimate  with  Known  Arrival  Rate 

In  this   subsection,   the   arrival  rate   is  asssumed 
known.    In  this   case  the   closed  form   expression  for   the 
jackknife  estimate  and  its  variance  estimate  can  be  derived. 
The   nonparametric  estimate   of  the   mean  number   of 

A 

customers  being  served  at  time  t,  M^/t),  can  be  expressed  in 
terms  of  the  service  times  as  follows: 

Mg(t)  =  U-J-Z.S,.  +  -«-t]  (4.12) 

where  S>' s  are  the  order  statistics  of  independent  and 
identically  distributed   random  quantities  from   the  unknown 

A 

probability  distribution  F,  and  the  variable  K  is  the  number 
of  S^ ' s  which  are  less  than  t.  This  equation  shows 
immediately  that  M^(t)  is  the  linear  function  of  the  order 
statistics  of  service  times. 

The  jackknife  estimate  is  based  on  sequentially 
deleting  point  S^  and  recomputing  the  estimator.  Removing 
point  S^;.  from  data  set  gives  a  different  empirical 
probability  distribution  F^.^  with  mass  ■—■  at  SC)  >  ,  Sj^ 
,..  ,  SI ;  ,S  .  /S  and  a  corresponding  recomputed  value  of 
the  estimate.  In  the  jackknife  process,  the  \^  pseudo 
value  is 

A  A 

M;(t)  =f  \t     if  i  >  K 

.  \stf>    if  i  $  K  (4.13) 
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for  the  fixed  time  t.    Accordingly   the  pseudo  values  M-(t) 

A 

have  just  K+l  different  values.   The  jackknife  estimate  is 

A  A 

M3(t)  =  \C^!^-  "-^-t]  (4.14) 

This  result   is  exactly  the   same  as  the   original  estimate. 
This  is  because  the  estimate  MT(t)  is  unbiased.   In  Appendix 
B,  the  jackknife  variance  estimate  is  derived  as  follows: 

*  A  A  A 

V*r[M,(t)]  =  -ir{J.|^-  2tF^(t)[-;-|^  -  [J-Jbl« 

- 

+  ^FJtJFJt)!  (4.15) 


A 


where  Frv(t)   is  the   sample  survivor   function.    Comparing 

equation  4.15  with   equation  4.3  we  see    that  (----)[Var  (Mu 

a  "* 

(t)]]  =  E[  Var  {M_(  t) }  J  .   Thus  the  jackknife  variance  estimate 

tends  to  be   conservative  in  the  sense   that  its  expectation 

is  greater   than  the  true  variance   of  M^(t).    We   will  now 

describe   two   selected  procedures    to   obtain   confidence 

intervals  for  the  jackknife  estimate.    Tukey  suggested  that 

the    statistic   in   equat:  n   4. 10   has   an    approximate 

t-distribution  with  n-1  degrees  of   freedom,   which  leads  to 

the  approximate  two-sided  100( l-o()%  confidence  interval 

M3(t)   ±   t  ,_«J  Var[M3(t)]  (4.16) 

for  MKj(t)/  where  t  ,_a_  is  the  upper  1-^.  critical  point  of 
the  t-distribution  with  n-1  degrees  of   freedom.    However, 

A 

the  n  estimate  pseudo  values  have  just  K+l  different  values. 
Hence,   another  possible  procedure  is   to  adjust  the  degrees 
of  freedom  of  the  t-distribution,  that  is,  subtract  one  from 

A 

the  number  of  different  pseudo  values  (K+l),  and  use  the 
result  as  the  degrees  of  freedom.  The  length  of  confidence 
interval  generated  using  the  adjusted  degrees  of  freedom  (K) 
is   slightly   wider   than  that   generated   using   the   usual 
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degrees  of   freedom  (n-1)   and   the  coverage  rate   should  be 
increased. 

2.   Jackknife  Estimate  with  Unknown  Arrival  Rate 

In  this  subsection,  it  will  be  assumed  that  the  rate 
of  Poisson  arrival  process  is  unknown  and  must  also  be 
estimated.  The  maximum  likelihood  estimate  is  \=n/|/;, 
where  y;  is  the  interarrival  time  between  i  and  (i-1) 
customers.  A  nonparametric  estimate  of  mean  number  of 
customers  being  served  at  time  t  is  given  by 


A 


IWt)  -M-;-£^  ♦  ----t!  (4.17) 


where  S^'s  are  the  order  statistics  of  independent, 
identically  distributed  random  quantities  from  the  unknown 
probability  distribution  F.  It  is  assumed  that  the  S(;j  s  and 
Y^ ' s  are  independent.  The  variable  K  is  the  number  of  S  ' s 
which  are  less  than  t.  The  data  consist  of  two  independent 
random  samples, 

S,,Sa, S^  ~  F  and  Y,  ,YX/...  ,YA  -  Q 

F  and  Q  being  two  possibly  different  distribution  on  the 
real  line  with  Q,  the  exponential  distribution  with  mean  jr  . 
From  equation  4.17,  the  estimate  M^(t)  is  the  product  of  two 
estimates.  One  is  the  function  of  y^  ,  \  =  'W'S"«/  and  the 
other  is  the  function  of  s,  ,  H(  s)  =  '^r  X  s-  +  — ~  t.  there  are 
many  possible  ways  to  perform  a  two-sample  jackknife 
procedure.  We  will  call  one  method  the  "paired  sample 
jackknife"  procedure.  Since  the  size  of  both  samples  is  the 
same,  we  make  the  one  set  of  observations  by  pairing 
respective  observations,   that  is,   ( s{ ,y  ) , ( sx ,v) ,...,( a L/Y 

A 

).   As  with  the  one-sample  jackknife,  we  estimate  the  M^u(t) 
for  all  the  data,  and  then,  we  estimate  Mm.y(t)   based  on  the 
remaining  data   obtained  by   leaving  out   just  the   i   pair. 
Thus  the  i  M  pseudo  value  M^(t)  is 

M,(t)  =  nMft„(t)  -  (n-l)M^(t)  (4.18) 
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A 

and  the  jackknife  estimate  Mj(t)   and  variance  estimates  are 
given  by 

M3(t)  =  «^M,(t) 


A  a.  J     Ji   A,  a 

S,  =  —  -—  X[M-(t)  -  M  ft)]2 
Based   on  these   statistics,   an   approximate  two-sided   100 
(  l-<\)%  confidence  interval  is  given  by 

M  (t)   ±   tH*  S_  (4.  19) 

a.   -i 

where  t  >_*   is  the  upper   1-  f£  point  of   t-distribution  with 

a. 

n-1  degrees  of  freedom.  A  second  method  is  called  the 
"separated  sample  jackknife"  procedure.  Since  we  assumed 
that  the  X^'s  and  Y^ ' s  are  independent,  we  can  perform  the 
jackknife  procedure  separately  for  each  sample,  and  then, 
estimates  which  combine  jackknife  estimates  and  the 
jackknife  variance  estimate  can  be  computed. 

Let  M3.(t)   be  the  jackknife  estimate  of  \  and  Vy    be 

the  jackknife  variance  estimate  for  \  .    Let  M3a(t)  be  the 

ft  —  * 

jackknife   estimate  of   }  F(s)ds  and  Vs   be  its   jackknife 

variance  estimate.    Then  the  combined  jackknife  estimate  of 

MJC(t)  is 

MJc(t)  =  M3l(t).M3a(t)  (4.20) 

and  the  combined  jackknife  variance  estimate  is 

Stc  =  \'V*    +  V^b^t)]2  +  V^M^t)]2  (4.21) 

The  approximate  two-sided  100  (  l-o()  %  confidence  interval  is 
given  by 

MJC(t)   ±   t  ,.a  SJC  /fn"  (4.22) 
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where  t  t_£  is  the  upper  1-^  point  of  t-distribution  with  n-1 
degrees  of  freedom. 

C.   BOOTSTRAP  ESTIMATION  METHOD 

Efron(1979)  introduced  the  bootstrap  method  for 
estimating  the  distribution  of  a  statistic  computed  from 
observations  [ Ref .  10].  The  bootstrap  estimate  is  obtained 
by  replacing  the  unknown  distribution  by  the  empirical 
distribution  of  the  data  in  the  definition  of  the 
statistical  function.  In  practice,  the  distribution  of  the 
statistic  is  approximated  by  Monte  Carlo  methods. 

For  convenience,  the  arrival  rate  is  assumed  to  be  known 

and  equal  to   1,   then  the  nonparametric  estimate   M^(t)   is 

just  a   function  of   service  times.     This  is   a  one-sample 

problem.   The  bootstrap  procedure  is  as  follows: 

1.  Suppose  that  the  data  points  Xi  ,Xi_  , ...,xn  are 
independent  observations  from  the  unknown  distribution 
F.   Then  the  true  estimate  is 

*Mt)  =  £F(s)ds  (4.23) 


We  can   estimate  the  distribution   F  by   the  empirical 
probability  distribution  Fn  . 
Fh  :  mass  -^  on  each  observed  data  point  x- 
1=1/2, ,n 

The  bootstrap  estimate  of  M^t)  is 
M8(t)  =  £  Fn(s)ds  (4.24) 


To  obtain  an  estimate  of  variability   for  Me(t),   we  procede 
as  follows 

(1)  Construct  Fn  (       the   empirical  distribution  function, 
as  just  described. 

(2)  Draw   a   bootstrap   sample    x*      ,  x*    , .  .  .  ,  x^    by 
independent  random  sampling  from  Fn . 

Notice  that  we  are  not   getting  a   permutation  distribution 

since  the   values  of  X-   are  selected  with   replacement  from 
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the  set  ( xv ,xx, . . . ,x~  ).  As  a  point  of  comparison,  the 
ordinary  jackknife  can  be  thought  of  as  drawing  samples  of 
size  n-1  without  replacement. 

(3)  Compute  an  estimate  of  M ki  (t)  for  each  bootstrap 
replication,  Mw(t),  that  is,  the  value  of  statistic 
evaluated  for  the  bootstrap  sample. 

M*(t)  =  >"^tx^ +  -k[n'i^K^^]t  (4-25> 

where  I(x^t)=\l    if  xU 

\0    otherwise 

(4)  Do  step   (2)   some  large   number  "B"   times  obtaining 
independent  bootstrap     replications   M"*.1  (t),    W"' 
(t),F..,  M*?(t).      *  ^ 

Based   on   the   bootstrap    replications,    the   approximate 

estimate  of  M^(t)  and  its  variance  are  obtained  by 

^*(t)=  i_5M^(t)  (4-26) 


Var[M6(t)]  =  ---3-[M^(t)  -  M^t)]1  (4.27) 

B-\  *=t 

A  formula  for  the  conditional  variance  of  MR(t)  given  the 
original  sample  data  is  derived  in  Appendix  C.  This 
expression  is  given  by         *,  , 

Var[MB(t)J  =  -i;l-Jr|«f  -  [^|^a  +  t«[  -«-]  A 

-2t[-^-][--?xJ  }  (4.28) 

Notice  that  the  expected  value  of  the  conditional  variance 
of  the  bootstrap  estimate  is  approximately  equal  to  the 
variance  of  the  nonparametric  estimate  of  M^t)  which  is 
derived  in  Appendix  A. 

So  far  we  have  considered  the  problem,  where  the  arrival 
rate  is  known.  The  bootstarp  methodology  also  applies  if 
the   arrival    rate   is   unknown    and   is    estimated   from 
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interarrival  data.  Suppose  the  data  consist  of  a  random 
sample  X=(X(  ,  X^  , ...,X  )  from  unknown  service  time 
distribution  F  and  an  independent  sample  Y=( Y,  ,YX  ,...  ,Y^) 
from  the  exponential  interarrival  time  distrbution  G  with 
unknown  parameter  X  .  One  bootstrap  procedure  to  estimate 
the  expected  number  of  customers  being  served  at  time  t  is 
to  construct  F^  and  G^  ,  the  empirical  probability 
distribution  corresponding  to  F  and  G.  Bootstrap  samples 
X*^  F^  ,  i=l/2/...n/  Y*~  Gn,  j=l,2/...n/  are  independently 
drawn,  an  estimate  of  Mw(t) 


mJ( 


t)  =  -J^-'-i'-  X  x*  +  -Mn  -  2  I(xf<t)]t)        (4.29) 


is  calculated.  As  before  there  are  a  large  number  B  of 
bootstrap  replications.  For  this  case,  the  bootstrap 
estimate  of  M^t)  and  its  variance  are  still  given  by 
equations  4. 26  and  4. 27.   There  appears  to  be  no  closed  form 

A. 

of  the  analytical  variance  of  MB  (t)  in  this  case.  Now  we 
will   describe   methods   to   obtain   approximate   confidence 

A, 

intervals  for  the  bootstrap  estimate  MB(t). 
1.   The  Percentile  Method 

A  simple  method  for  assigning  approximate  confidence 
intervals  to  the  nonparametric  estimate  M^t)  is  as  follows: 
Let 

C(t)  =  -l--H---~-tL-  (4.30) 

6 

be  the  cumulative  distribution  function  of  the  bootstrap 
distribution  of  M^(t);  B  is  the  number  of  bootstrap 
replications.   For  a  given  0<o<<0.5,  define 

L(o<)  =  C*'(oO     ,     U(o<)  =  c"(l-o() 
Usually  denoted   simply  by  L   and  U.    This   definition  runs 
into  complications  when  we  actually  try  to  compute  quantiles 

A  A 

L  and  U  from  a  set   of  bootstrap  replications.    To  overcome 
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these  difficulties,  we  order  the  bootstrap  replications  from 
smallest  to  largest,  obtaining  the  sorted  data  M ^  (t),  for 
i=l  to  B.  Letting  represent  any  fraction  between  0  and 
1;  take  Q(o<.)  to  be  M  ^  (t)  whenever  Q  is  one  of  the 
functions  °^;  =  — g-2-  ,  for  i=l  to  B.  Thus  L(o<)  turns  out  to 
be  the  (B  *X+  0.5)+K  M^°(t)  and  UK)  to  be  the  (B  *  (  l-°<) 
+0.5)   M  ^  (t).   The  percentile  method  consists  of  taking 

[  L(o<)  ,  U(o<)  ]  (4.31) 

A 

as  an  approximate  l-2o(  confidence   interval  for  M6(t)   since 

A  A 

°<=C(L),  l-o<=C(U),  the  percentile  method  interval  consists 
of  the  central  1-2  c<  proportion  of  the  bootstrap 
distribution. 

2.   The  Bias-corrected  Percentile  Method 

Efron(1980)   suggests  the   following  bias  correction 
for  the  percentile  confidence   interval  procedure  [ Ref .  11]. 

A 

He  argues  that  if  MQ(t)  is  not  the  median  of  the  bootstrap 
replication  distribution,  then  a  bias  correction  to  the 
percentile  method  is  called  for.   To  be  specific,  define 

z0  =  i"l[C(Ma(t))]  (4.32) 

where  C(t)=  ~ as  in  equation   4.30,   and     is  the 

cumulative  distribution  function  for  a  standard  normal 
variate.  The  bias  corrected  percentile  method  consists  of 
taking 

[C^{   £(2z0-  z^)}   ,   C~*{£(2z0  +  Za<)}]  (4.33) 

as  an  approximate  1-2  o(  central   confidence    interval    for 

A 

Mft(t).  Here  z^  is  the  upper  point  for  a  standard  normal 
$_(zx)=l-o(. 
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A 

Notice  that  if  MB(t)   is  the  median  of  the  bootstrap 
distribution  then  zo=0  and  equation  4. 33  reduces  to  equation 


4.31,  the  uncorrected  percentile  interval.  However,  even 
small  differences  of  Pr{MKj(t)£  MB(t)j  from  0.5  can  make 
equation  4.  33  much  different  from  equation  4. 31. 
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V.  SIMULATION  RESULTS 

The  purpose  of  the  simulation  in  this  chapter  is  to 
assess  the  performance  of  the  nonparametic  estimation 
methods,  the  jackknife  and  the  bootstrap.  Since  the 
estimate  of  M(t),  the  mean  number  of  customers  being  served 
at  time  t,  is  a  function  of  the  customer  arrival  rate  and 
the  integral  of  the  survivor  function  of  the  service  time 
distribution,  two  simulations  cases  are  done.  The  first 
simulation  case  was  performed  to  estimate  MNj(t),  the 
nonparametric  estimate  of  M(t),  as  a  function  of  the  service 
times  with  the  arrival  rate  assumed  to  be  known  and  set 
equal  to  1.  For  this  case,  the  jackknife  and  bootstrap 
estimate  of  the  variance  were  derived  in  the  chapter  IV,  and 
compared  with  the  numerical  estimate  obtained  by  the 
simulation.  The  second  case  assumed  that  the  customer 
arrival  rate  is  also  unknown  and  must  be  estimated  using 
interarrival  times. 

In  each  replication  of  the  simulation  for  case  1,  50 
independent  service  times  from  a  specified  service  time 
distribution  were  generated.  For  the  bootstrap  procedure, 
500  bootstrap  replications  were  performed.  The  simulation 
was  replicated  300  times.  For  the  purposes  of  comparison, 
we  considered  four  types  of  service  time  distributions, 
which  were  the  exponential,  the  mixed  exponential,  the 
gamma,  and  the  lognormal  distribution.  The  arrival  process 
is  known  to  be  Poisson  process  with  known  rate  \=1.  The 
same  generated  service  times  were  used  for  each  estimation 
procedure  in  a  replication.  This  reduces  the  variability  of 
the  differences  in  performance  between  the  procedures.  All 
programming  was  done  on  IBM  3033  computer  at  the  Naval 
Postgraduate  school  using  the  LLRANDOMII,  random  number 
generating  package  [ Ref .  6]  . 
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TABLE  III 

STATISTICAL  DATA  OF  ESTIMATE  OF  M(T) 
N=50  R=300  (B=500) 


True 
value 

Parametric 

Nonparametric 

Correct 

Errors 

Jack 

Boot 

Exponential 
(  M  =  2) 

0. 7869 

0. 7830 
(0.  0268) 

- 

0. 7820 
(0.  0451) 

0. 7819 
(0.  0452) 

Mixed  expon. 
^=2,^=.  75,f,=.  i 

0. 5992 
I) 

0. 5871 
(0.  0023) 

0. 6246 
(0.  0026) 

0. 5991 
(0.  0512) 

0. 5989 
( 0. 0512) 

Gamma 
(  /J  =  l,  K=2) 

0. 8963 

0. 8952 
(0.  0009) 

0. 7861 
(0.  0010) 

0. 8965 
(0.  0325) 

0. 8967 
(0.  0325) 

Lognormal 
(§=.193,  j*=l) 

0. 8094 

0.  8140 
( 0. 0021) 

0. 7844 
(0.  0020) 

0. 8139 
(0.  0383) 

0. 8138 
(  0. 0383) 

Table  III  presents  the  results  of  several  estimation 
methods  when  the  arrival  rate  is  given  and  equal  to  1.  The 
top  of  each  cell  gives  the  mean  estimate  of  M(t)  at  time 
t=l,  where  M(  t)=  [o  F(  s)ds.  The  bottom  part  of  each  cell 
gives  the  standard  deviation  of  the  estimate.  For  service 
time  distributions  other  than  exponential,  a  parametric 
estimate  based  on  an  erroneous  exponential  model  is  also 
given.  The  estimate  in  the  case  of  an  exponential  model  is 
[ 1-EXP [ -t/u] ]  where  M  is  the  mean  service  time.  For  each 
service  time  distribution,  the  standard  deviation  of  the 
parametric  estimate  of  M(t)  is  smaller  than  that  of  the 
nonparametric  estimate  of  M(t).  That  is,  the  efficiency  of 
the  parametric  estimation  method  is  better  than  the 
efficiency  of  the  nonparametric  estimation  method.  However, 
the  results  of  a  parametric  fit  assuming  an  erroneous 
exponential  model  show  the  worst  performance.  The  true 
value  of  M(t)  is  not  included  within  plus  or  minus  three 
standard  deviations  of  the  erroneous  estimate  M(t).  In  the 
table,  the  nonparametric  estimation  methods  seem  to  perform 
well  in  all  cases  with  the  cost  of  an  inflation  of  variance. 
Hence  the  nonparametric  estimation  method  is  to  be  preferred 
when  the  service  distribution  is  unknown. 
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To  illustrate  the  efficiency  of  the  nonparametric 
estimation  methods,  we  simulated  two  possible  ways  to 
construct  the  approximate  confidence  interval  for  M^(t)  for 
the  bootstrap  and  the  jackknife  methods.  Those  ways  are 
presented  in  chapter  IV.  For  the  jackknife  estimation 
method,  one  procedure  was  to  construct  the  confidence 
interval  with  the  regular  degrees  of  freedom,  n-1,  and  the 
other  used  the  reduced  degrees  of  freedom,  which  is  the 
number  of  different  pseudo  values.  For  the  bootstrap 
estimation  method,  one  way  used  the  percentile  method  by  the 
Monte  Carlo  process,  and  the  other  used  the  bias-corrected 
percentile  method;  there  were  500  bootstrap  replications. 
Nominal  68%,  80%,  and  90%  confidence  intervals  were 
constructed  for  each  replication  using  each  method.  It  was 
noted  whether  the  confidence  interval  formed  by  a  given 
method  covered  the  true  value  M(t).  The  entire  process  was 
independently  replicated  with  R=300  times.  From  these  R 
replications  we  computed,  for  each  method,  the  proportion  p 
of  the  R  confidence  intervals  which  contained  M(t),  as  well 
as  the   average  length  of   the  confidence  intervals.     If  a 

A 

method  was  performing  adequately,  p  should  be  near  l-o<  ,  and 
a  small  mean  length  is  desirable. 

Tables  IV  to  VII  show  the  simulation  results  of  several 
confidence  interval  procedures  for  four  types  of  service 
time  distribution;  the  exponential,  the  mixed  exponential, 
the  gamma,  and  the  lognormal.  The  arrival  process  is 
Poisson  with  known  arrival  rate  \=1. 

In  order  to  compare  the  performance  of  these  procedures 
to  the  normal  confidence  interval  procedure,  simulations 
were  conducted,  and  nominal  68%,  80%,  and  90%  confidence 
limits  were  constructed  for  time  t=l  for  each  replication. 
The  normal  confidence  interval  procedure  is  based  on  the 
order  statistics  of  the  service  times.  By  the  central  limit 
theorem,  the  distribution  of  M^(t)   is  asymptotically  normal 
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TABLE  IV 

COVERAGE  AND  LENGTH  OF  100(  l-<*)%  C.  I 

FOR  EXPONENTIAL   WITHu=2,  \  =1   AT  T=l 


68  ? 

0 

80  % 

90  % 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

Normal  C.  I. 
Procedure 

0. 0888 
(0.  0089) 

19.  33 
69.  33 
11.  33 

0. 1147 
(0. 0101) 

12.  00 

83.  67 

4.  33 

0. 1477 
(0.  0141) 

4.  67 

92.  00 

3.  30 

Jack- 

Reduced 
d.  f 

0. 0911 
(0.  0089) 

17.  67 
71.  33 
11.  00 

0. 1187 
(0.  0102) 

11.  00 

85.  33 

3.  67 

0. 1548 
(0.  0142) 

3.  33 

94.  00 
2.  67 

knife 

Regular 
d.  f 

0. 0897 
(0. 0090) 

18.  38 
70.  67 
11.  00 

0. 1163 
(0.  0103) 

11.  33 

84.  33 

4.  33 

0. 1505 
(0.  0144) 

4.  00 

93.  00 

3.  00 

Boot- 

Percen- 
tile 
method 

0. 0884 
(0.  0098) 

18.  67 
69.  00 
12.  33 

0. 1132 
(0.  0112) 

12.  00 

82.  00 

6.  00 

0.  1462 
(0. 0154) 

4.  33 

92.  00 

4.  67 

strap 

Bias- 
correct 
method 

0. 0887 
(0.  0098) 

16.  67 
71.  00 
12.  33 

0. 1137 
(0.  0112) 

10.  67 

83.  00 

6.  33 

0. 1467 
(0.  0154) 

2.  67 

92.  67 

4.  67 

distributed  as  the  number  of  data  points  n  -*co  .    Thus,   the 
100(  1-<X)%  normal  confidence  interval  is  given  by 


M^t)   ±   zl.iyVar[MV4(t)] 


(5.1) 


where  z  ._£.  is  the   upper  1-^  point   of  the   standard  normal 

a. 

distribution  and  Var[M^(t)]   is  given  by  equation  A.  14  in 
appendix  A. 

Each  cell  in  the  tables  contain  the  average  and  standard 
deviation  of  confidence  interval  length;  and  the  proportion 
of  intervals  that  are  too  high,  (e.g.  M  (t)<L),  where  L  is 
the  lower  bound  of  interval;  the  proportion  of  intervals 
covering  the  true  value  M(t),  p;  the  proportion  of  interval 
that  are  too  low,  (e.g.  M(t)>U),  where  U  is  the  upper  bound 
of  interval.  Table  IV  is  for  the  exponential  service  time 
case  with  with  M  =2 ;  Table  V  is  for  the  mixed  exponential 
service  time  case  with  M(=2,  A^0.  75,   and  Pj=0.2;   Table  VI 
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TABLE  V 

COVERAGE  AND  LENGTH  OF  100(l-°<)%  C.  I.  FOR 

MIXED  EXPONENTIAL  WITH  u,  =2 ,  ,Ux=.  75,  P(=.  2,  \  =1   AT  T=l 


68  % 


Length 

(s.d) 


80  % 


Length 

(s.d) 


90  % 


Length 
(s.d) 


C.  R. 


Normal  C. I 
Procedure 


0. 0914 
( 0. 0084 


14.  67 
71.  67 
13.  67 


0. 1169 
(0.  0102 


10.  00 

83.  00 

7.  00 


0. 1509 
(0.  0143 


5.  00 

90.  67 

4.  33 


Jack- 
knife 


Reduced 
d.  f 


0. 0936 
(0.  0085 


14.  00 
73.  00 
13.  00 


0. 1208 
( 0. 0103 


9.  67 

83.  33 

7.  00 


0. 1581 
(0.  0143 


4.  33 

91.  67 

4.  00 


Regular 
d.  f 


0. 0923 
( 0. 0085 


14.  67 
72.  00 
13.  33 


0. 1185 
(0. 0103 


10.  00 

83.  00 

7.  00 


0. 1538 
(0.  0145 


4. 

91. 
4. 


33 
33 
33 


Boot- 
strap 


Percen- 
tile 
method 


0. 0911 
(0.  0094 


14.  33 
71.  67 
14.  00 


0. 1156 
(0.  0112 


11.  33 

81.  33 

7.  33 


0. 1490 
(0.  0153 


4.  33 

89.  67 

4.  33 


Bias- 
correct 
method 


0. 0913 
( 0. 0094 


14.  00 
71.  00 

15.  00 


0. 1161 
(0.  0112 


9.  00 

83.  00 

8.  00 


0. 1495 
( 0. 0153 


4.  00 

89.  67 
6.  33 


is  for  the  gamma  service  time   case  with  3=1  and  K=2;   and 
Table   VII   is  for   the   lognormal   service  time   case   with 
§  =0.  193  and  ^=1. 

The  overall  examination  of  the  tabulations  of  confidence 
limit  coverage  and  also  the  average  and  standard  deviation 
of  confidence  interval  length  suggest  that  the  bootstrap 
procedure  is  slightly  better  than  the  jackknife  procedure; 
however,  the  difference  is  negligible.  The  normal 
confidence  interval  is  also  about  the  same  as  the  jackknife 
and  bootstrap  procedures  indicating  that  a  sample  size  of  50 
is  large  enough  for  the  central  limit  theorem  approximation 
to  be  adequate.  All  procedures  produce  almost  the  same 
average  length  of  confidence  interval  with  a  good  coverage 
rate,  which  falls  within  ±  2/*°'°°  of  1-  °<  .  Although  the 
method  of  the  reduced  degrees  of  freedom  used  in  the 
jackknife  and  the  bias-correct  percentile  method  applied  in 
the  bootstrap  improved  the  coverage   rate,   the  variance  was 
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TABLE  VI 

COVERAGE  AND  LENGTH  OF  100(1-*)%  C.I. 
FOR  GAMMA  WITH  (3=1,   K=2 ,  \=1      AT  T=l 


68  % 

80  ? 

> 

90  % 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C.  R. 

Normal  C.  I. 
Procedure 

0. 0595 
(0.  0100) 

22.  00 

68.  33 

9.  67 

0. 0774 
(0.  0120) 

12.  33 

80.  33 

7.  33 

0. 0989 
( 0. 0174) 

10.  67 

86.  33 

3.  00 

Jack- 

Reduced 
d.  f 

0. 0617 
(0.  0102) 

21.  00 

70.  00 

9.  00 

0. 0813 
(0.  0122) 

11.  33 

82.  67 
6.  00 

0. 1062 
(0.  0178) 

8.  33 

89.  00 

2.  67 

knife 

Regular 
d.  f 

0. 0601 
(0.  0102) 

21.  67 

69.  33 
9.  00 

0. 0785 
(0.  0121) 

12.  33 

80.  67 

7.  00 

0. 1008 
(0.  0177) 

10.  33 

86.  67 

4.  00 

Boot- 

Percen- 
tile 
method 

0. 0589 
(0.  0091) 

21.  33 

69.  97 
9.  00 

0. 0766 
(0.  0122) 

12.  00 

80.  33 

7.  67 

0. 0981 
(0.  0179) 

9.  33 

86.  67 

3.  00 

strap 

Bias- 
correct 
method 

0. 0595 
(0.  0100) 

19.  33 
69.  33 
11.  33 

0. 0777 
( 0. 0121) 

10.  67 

81.  00 

8.  33 

0. 0990 
(0. 0180) 

8.  67 

87.  00 

4.  33 

inflated.  Furthermore,  the  amount  of  improvement  was  small 
and  not  significant.  Hence,  the  original  procedures  for 
constructing  the  confidence  interval  for  the  jackknife  and 
bootstrap  are  preferred  in  this  case.  Note  that  the 
coverage  rates  are  skewed  left  slightly  but  almost  balanced. 
It  is  a  reason  that  the  normal  confidence  interval  procedure 
performs  well. 

Results  will  now  be  reported  for  the  simulation  of  the 
case  in  which  the  arrival  rate  of  the  Poisson  process  is 
also  unknown  and  must  be  estimated  from  interarrival  time 
data.  More  computations  are  required  for  this  case; 
however,  the  procedure  is  same.  Each  replication  of  the 
simulation  generated  50  independent  service  times  and  50 
independent  exponential  interarrival  times  having  mean  1. 
Confidence  intervals  were  computed  using  both  separated  and 
paired  jackknife  procedures  and  the  percentile  method  for 
the  bootstrap.     The  number   of  bootstrap   replications  was 
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TABLE  VII 

COVERAGE  AND  LENGTH  OF  100(1-°^)%  C.I. 
FOR  LOGNORMAL  WITH   §=.193,  <$•*•=!,  \  =1   AT  T=l 


68 


Length 
(s.d) 


R. 


80  % 


Length 
(s.d) 


C.  R. 


90 


Length 
(s.d) 


C.  R. 


Normal  C.  I 
Procedure 


0. 0769 
( 0. 0075) 


16.  67 
71.  00 
12.  33 


0. 1007 
( 0. 0096) 


10.  00 
78.  00 
12.  00 


0. 1272 
( 0. 0126) 


6.  67 

89.  33 

4.  00 


Jack- 
knife 


Reduced 
d.  f 


0. 0787 
( 0. 0076) 


16.  33 
71.  33 
12.  33 


0. 1208 
( 0. 0097) 


9.  67 
79.  67 
10.  67 


0. 1330 
(0.  0127) 


6.  33 

89.  67 
4.  00 


Regular 
d.  f 


0. 0763 
(0. 0076) 


16.  67 
70.  33 
13.  00 


0. 1021 
( 0. 0097) 


10.  00 
79.  00 

11.  00 


0. 1297 
(0.  0128) 


6.  67 

89.  33 

4.  00 


Boot- 
strap 


Percen- 
tile 
method 


0. 0763 
(0. 0084) 


16.  67 
69.  33 
14.  00 


0. 0998 
(0.  0105) 


10.  33 
77.  00 
12.  67 


0. 1258 
(0.  0133) 


7.  33 

88.  67 

5.  00 


Bias- 
correct 
method 


0. 0768 
(0.  0084) 


16.  00 
68.  33 
15.  67 


0. 1004 
(0.  0106) 


8.  67 
78.  33 
13.  00 


0. 1263 
(0.  0133) 


6.  33 

88.  67 

5.  00 


1000.  Nominal  68%,  80%,  and  90%  confidence  limits  were 
computed  for  each  replication.  The  simulation  was 
replicated  300  times. 

Tables  VIII  to  X  report  the  results  of  the  simulation. 
The  quantities  in  the  left  part  of  each  cell  are  the  average 
and  standard  deviation  (within  parenthesis)  of  coverage 
interval  length.  The  right  part  of  each  cell  contains  three 
quantities;  the  top  value  is  the  proportion  of  intervals 
that  are   too  high;   the  center   value  is  the   proportion  of 

A. 

intervals  that  cover  the  true  value,  p;  and  the  bottom  part 
is  the  proportion  of  intervals  that  are  too  low. 

In  Table  VIII  (the  case  of  68%  C.I. ),  the  average  length 
from  the  bootstrap  shows  outstanding  performance  with  a 
small  value  of  standard  deviation.  The  paired  jackknife 
procedure  performs  as  well  as  the  bootstrap  procedure.  This 
procedure  reduced  the  standard  deviation  by  more  than  half 
of   that  in   the  separated   jackknife   procedure,   and   also 
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improved  the  coverage  rate.  From  the  results  of  coverage 
rate  in  the  table,  it  can  be  recognized  immediately  that  the 
jackknife  estimate,  regardless  of  the  application  method,  is 
often  too  low,  while  the  bootstrap  estimate  tends  may  a 
little  too  high  but  is  almost  balanced  in  the  number  of 
confidence  intervals  that  are  too  high  or  too  low.  It  is 
the  reason  that  the  bias-corrected  percentile  method  was  not 
required  in  this  case. 

In  this  simulation,  all  the  coverage  rates  fall  within 
±  2  J^a^'o ^  of  1-  e(  ,  (62.61,  73.38).  Note  that  the  average 
length  of  the  confidence  interval  in  the  gamma  service  time 
case  is  the  highest.  When  the  arrival  rate  was  known,  the 
gamma  service  time  case  had  the  smallest  average  length. 
This  indicates  that  the  variability  of  the  estimated  arrival 
rate  may  be  the  dominate  effect  in  the  width  of  the 
confidence  interval. 

The  results  of  Tables  IX  and  X  support  the  facts  of 
discussion  about  Table  VIII,  This  is  the  reasonable  since 
the  same  random  numbers  were  used  to  compute  these 
confidence  interval.  The  presentation  of  Table  IX  is 
exactly  the  same  as  the  case  of  table  VIII.  All  the 
coverage  rate  are  fall  within  (75.38,  84.61),  though  the 
value  of  coverage  rates  fluctuate  over  the  service  time 
distribution  cases.  Obviously  the  paired  jackknife 
procedure  performs  very  well.  The  bootstrap  procedure  still 
has  the  best  performance;  however  the  value  of  coverage  rate 
fluctuates  greatly  for  the  different  service  time 
distributions.  For  the  90%  confidence  interval  case 
reported  in  Table  X,  some  coverage  rate  fall  outside  the 
range  (86. 53,  93.46).  The  separated  jackknife  procedure 
produces  low  coverage  rates  outside  of  (86.53,  93.46), 
except  for  the  exponential  service  time  distribution.  The 
paired  jackknife  procedure  improved  the  coverage  rate 
tremendously.    Although   the  average  lengths  in   the  paired 
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jackknife  procedure  are  slightly  bigger  than  those  in  the 
bootstrap  procedure,  the  overall  performance  is  better  than 
the  bootstrap.  Furthermore,  the  procedure  in  the  case  of 
gamma  service  time  distribution,  the  bootstrap  produced  one 
coverage  rate  outside  of  (86.53,  93.46).  However,  this 
could  be  due  to  sampling  fluctuation. 

In  general,  all  the  confidence  interval  procedures 
performed  very  well  for  the  exponential  service  time  case, 
regardless  of  the  level  of  the  confidence  interval.  The 
procedures  also  worked  well  in  general  to  produce  68%  and 
80%  confidence  interval.  However,  performance  was  more 
variable  in  the  90%  confidence  interval  case.  In  most 
cases,  the  average  length  produced  by  the  bootstrap 
procedure  is  the  smallest,  but  the  value  of  the  coverage 
rate  fluctuates  for  different  service  time  distribution. 
The  overall  examination  of  the  tabulations  suggests  that  the 
paired  jackknife  procedure  performs  very  well  compared  to 
the  separated  jackknife  procedure  and  in  some  cases  shows 
better  performance  than  the  bootstrap  procedure. 
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TABLE  VIII 

COVERAGE  AND  LENGTH  OF  68%  C.  I. 
WITH  UNKNOWN  ARRIVAL  RATE  (N=50/   R=300,   B=1000) 


Jackknif e 
Separated       Paired 


Bootstrap 


Percentile 


Length 

(s.d) 


C.  R. 


Length 

(s.d) 


R. 


Length 
(s.d) 


C.  R. 


Exponential 
(  AA=  2) 


0. 2674 
(0.  1282) 


7.  00 
70.  00 
23.  00 


0.2525 
(0. 0570) 


9.  67 
70.  33 
20.  00 


0. 2436 
(0.  0510) 


18.  33 
66.  67 
15.  00 


Mixed  exponen 
(  mi  =2,  Mx=.  75 


0. 1999 
(0.  0832) 


12.  00 
65.  67 
22.  33 


0. 2077 
( 0. 0429) 


15.  00 
68.  00 
17.  00 


0. 2009 
( 0. 0398) 


21.  67 
64.  00 
14.  33 


Gamma 
(  0=1,  k=2) 


0.  2917 
( 0. 1376) 


10.  00 
68.  00 
22.  00 


0. 2746 
(0.  0599) 


13.  33 
67.  67 
19.  00 


0. 2632 
(0. 0557) 


21.  67 
66.  00 
12.  33 


Lognormal 
(§=.  193,  cf"=l) 


0. 2533 
(0.  0998) 


7.  67 
72.  00 
20.  33 


0.  2513 
(0.  0486) 


10.  00 
72.  33 
17.  67 


0.2415 
(0.  0461) 


19.  33 
69.  67 
11.  00 


TABLE  IX 

COVERAGE  AND  LENGTH  OF  80%  C.  I. 
WITH  UNKNOWN  ARRIVAL  RATE  (N=50,   R=300, 


B=1000) 


Jackknif e 

Bootstrap 

Separated 

Paired 

Percentile 

Length 
(s.d) 

C.  R. 

Length 
(s.d) 

C. 

R. 

Length 
(s.d) 

C.  R. 

Exponential 
(  M=  2) 

0. 3447 
(0.  1329) 

5.  33 

79.  67 
15.  00 

0. 3282 
(0. 0641) 

8. 
82. 

10. 

00 
00 
00 

0. 3162 
( 0. 0585) 

12.  67 

83.  00 

4.  33 

Mixed  exponen 
(M,  =2.  iTx=.  75 
P,  =.2) 

0.  2558 
(0.  1088) 

75.  67 

0. 2688 
( 0. 0569) 

5. 
82. 
12. 

67 
33 
00 

0. 2553 
( 0. 0490) 

12.  67 

83.  00 

4.  33 

Gamma 
(  (S=l,  k=2) 

0.  3646 
(0.  1509) 

5.  00 
78.  67 
19.  33 

0. 3504 
( 0. 0703) 

6. 
81. 
11. 

67 
67 
67 

0. 3375 
( 0. 0657) 

16.  33 

75.  67 

8.  00 

Lognormal 
(§=.  193,  cfVL) 

0. 3259 
(0.  1279) 

4.  67 
77.  67 
16.  67 

0.  3231 
(0.  0624) 

6. 
79. 
14. 

33 
33 
33 

0. 3115 
(0.  0585) 

14.  67 
75.  67 
10.  00 
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TABLE  X 

COVERAGE  AND  LENGTH  OF  90%  C.  I. 
WITH  UNKNOWN  ARRIVAL  RATE  (N=50,   R=300,   B=1000) 


Jackknif e 
Separated      Paired 


Bootstrap 


Percentile 
C.  R. 


Length 
(s.d) 


R. 


Length 
(s.d) 


R. 


Length 

(s.d) 


Exponential 
(  M=  2) 


0. 4464 
( 0. 1680) 


1.  33 

90.  00 

8.  67 


0. 4231 
(0.  0844) 


2.  67 

92.  33 
5.  00 


0. 4091 
(0.  0761) 


8.  67 

89.  00 

2.  33 


Mixed  exponen 
(m,  =2.  m*.=.  75 
P,  =.2) 


0.  3199 
(0.  1213) 


0.  67 
83.  33 
16.  00 


0. 3387 
( 0. 0679) 


2.  33 

88.  00 
9.  67 


0. 3301 
(0.  0609) 


8.  00 

86.  67 

5.  33 


Gamma 
(  £=1,  k=2) 


0. 4891 
(0.  2321) 


2.  00 

85.  33 
12.  67 


0. 4586 
(0.  1053) 


3.  33 

88.  67 
8.  00 


0. 4416 
( 0. 0959) 


9.  00 

85.  67 
5.  33 


Lognormal 
(§=.  193,  cf"=l) 


0. 4371 
(0.  1974) 


1.  67 
85.  33 
13.  33 


0. 4228 
(0.  0920) 


3.  00 

90.  00 
7.  00 


0. 4074 
(0.  0847) 


7.  00 

88.  67 

4.  33 
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VI.  SUMMARY  AND  CONCLUSIONS 

This  thesis  consider  the  problem  of  estimating  M(t),  the 
mean  number  of  customers  being  served  at  time  t  for  an 
M/G/oo  queue,  using  service  time  and  interarrival  time  data. 
It  is  assumed  that  there  are  no  customers  being  served  at 
time  0.  Two  cases  are  considered.  In  one  the  parametric 
form  of  the  service  time  distribution  is  assumed  known.  In 
this  case  M(t)  is  a  function  of  the  estimated  parameters. 
In  the  situation  in  which  the  arrival  rate  of  the  Poisson 
process  is  also  assumed  known  and  the  parametric  form  of  the 
service  time  distribution  is  exponential,  approximation  to 
the  bias  and  variance  of  the  estimate  are  derived.  Further, 
simulation  is  used  to  study  a  normal  confidence  interval 
procedure. 

For  the  other  case  the  parametric  form  of  the  service 
time  distribution  is  unknown.  The  empirical  distribution  of 
the  service  time  distribution  is  used  in  the  estimate  of 
M(t).  In  the  situation  in  which  the  arrival  rate  \  is 
assumed  known,  the  distribution  of  the  estimate  is  derived 
in  Appendix  A.  The  bootstrap  and  jackknife  estimates  with 
known  are  studied  in  Appendix  B  and  C.  Simulation  was  used 
to  assess  the  performance  of  confidence  interval  procedures 
using  a  normal  approximation  the  jackknife  and  the 
bootstrap.  The  simulation  results  for  the  case  in  which  the 
arrival  rate  \    is  known  indicate  that: 

(1)  The  parametric  estimation  method  appears  the  most 
powful  method  when  the  parametric  assumption  is 
correct,  but  the  performance  is  seriously  degraded  if 
the  assumption  is  not  appropriate. 

(2)  When  an  erroneous  parametric  (exponential)  model  is 
assumed,  the  initial  estimates  of  mean  number  in 
service  are  poor.  However,  as  t  *<o  ,  the  erroneous 
parametric  estimate  approaches  the  same  value  as  the 
other  estimates.  This  is  because  as  t  ■*<»  all  the 
estimates  approach  the  sample  mean  of  the  service 
time  data. 
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(3)  The  estimate  obtained  by  using  the  empirical 
distribution  is  unbiased  with  a  larger  variance  than 
a  parametric  estimate  based  on  a  correct  model. 

(4)  Simulation  results  indicate  there  is  not  much 
difference  between  jackknife  and  bootstrap  confidence 
interval  procedures. 


(5)   The    nonparametric    normal     confidence    interval 

procedure  performs   as  well  as   the  procedure   in  (4^ 

since   the  distribution   of   the   estimate  is   almos' 

symmetric.    The   improvement  by  the  use   of  adjustec 

degrees   of    freedom   in    the   jackknife    and   the 


symmetric.    The   improvement  by  the  use   of  adjusted 
degrees   of    freedom   in    the   jackknife    and   th 
bias-corrected  percentile  in  the  bootstrap  is  small. 


We  now  discuss  the  simulation  results  for  the  case  in 
which  the  arrival  rate  for  the  Poisson  process  is  also 
unknown  and  is  estimated  using  interarrival  time  data.  The 
service  times  are  generated  from  four  types  of  distribution. 
The  percentile  method  for  the  bootstrap  and  paired  and 
separated  techniques  for  the  jackknife  were  used  to 
construct  the  confidence  intervals.  Tables  I  and  II,  which 
are  the  results  of  a  parametric  confidence  interval 
procedure  in  chapter  III,  are  compared  with  the  results  of 
the  nonparametric  confidence  interval  procedures.  The 
simulation  results  indicate  that: 

(1)  The  nonparametric  confidence  interval  procedure  works 
as  well  as  the  parametric  case,  even  though  the 
length  of  the  confidence  interval  is  wider  than  the 
parametric  one. 

(2)  In  the  overall  examination,  the  percentile  method  of 
the  bootstrap  shows  the  best  performance.   The  paired 


jackknife  procedure  also  has  similiar  results  to  the 
bootstrap  approach.  The  results  of  these  two 
nonparametric  procedures  show  the  almost  same  level 
of  performance  with  the  parametric  one.  However,  the 
separated  jackknife  procedure  produces  poor  results. 

(3)  The  results  of  the  jackknife  procedures  produce 
intervals  that  are  always  biased  upward.  Efron 
(1981)  reported  similiar  results.   [ Ref .  13] 

(4)  Since  the  bootstrap  procedures  require  a  large  amount 
of  computation,  the  jackknife  is  the  method  of  choice 
if  one  does  not  want  to  do  the  bootstrap 
computations. 

In  general,   the  nonparametric   methods  of  the  bootstrap 

and   the   jackknife   performs   very   well,    regardless   the 

complexing  of  the   estimation  problem.    Of  cource,    if  the 

parametric  estimation  method  can  be  applied,  the  results  are 

clearly   superior.      However,    the   application   of   the 
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parametric  estimation  is  a  highly  limited  because  the 
parametric  assumption  is  often  difficult  to  verify.  When 
the  estimate  is  simple  enough,  which  is  the  nonparametric 
estimate  when  the  arrival  rate  is  known,  and  the  asymptotic 
distribution  of  estimate  can  be  obtained,  the  nonparametric 
normal  confidence  interval  procedure  performs  well,  and  more 
complicate  computations  such  as  the  jackknife  and  the 
bootstrap  method  are  not  required.  However,  the  jackknife 
and  the  bootstrap  method  have  a  good  performance  for  the 
more  complicated  problem  in  which  the  arrival  rate  is 
unknown.  The  bootstrap  confidence  intervals  show  the  best 
performance  but  the  paired  jackknife  procedure  achieve  the 
same  level  of  performance  with  less  computation  than  the 
bootstrap  in  this  problem. 
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APPENDIX  A 
CALCULATING  THE  BIAS  AND  THE  VARIANCE  OF  M^T)  WITH  KNOWN 

ARRIVAL  RATE 

It  will  be  assume  that  the  rate  of  Poisson  process  \  is 
known  and  is  equal  to  1.  The  nonparametric  estimate  M^(t) 
is  given  by 

Mn(t)  =  £[1-F(s)]da  (A.  1) 

Using  the  empirical  cumulative  density  function  F^ ,  the 
nonparametric  estimate  is 

Mw(t)  =  --Is,.  +  "—  t  (A.  2) 

where  the  observation  SUJs  are  the  order  statistics  of  the 
indeprndent  and  identically  distributed  service  times  with 
unknown  distribution  F.  To  find  the  distribution  of  M^(t), 
we  will  study  the  distribution  of  {S^j. 

Let  X, ,XX, .  .  .  ,X-  be  independent,  identically  distributed 
random  variable  with  distribution  function  E    Let  N  be  the 
number  of  X; ' s  which  are  less  than  t.   Let  X^. denote  the  i 
smallest  X;.   By  the  definition  of  conditional  probability, 

P{X^  x|Nt  =  l}  =  Ll^-V. (A.  3) 

for  x<t.  Since  the  random  variable  Nr  has  a  binomial 
distribution  with  a  parameter  F(t),  we  can  rewrite  the 
equation  A.  3  to  obtain 

-  ^{i)  (A. 4) 
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Let   f(x)    be   the   density  of   the   distribution   function  F.       The 
conditional  probability  given  1^=2, 
J?[x^Xjxx+   dx,  ,    xJX^fx^   dxJNv=2} 

=   2    L^J.ih    5S&i*h.  (A.  5) 

7lO  TOO 

for  x  (<xa. 

Given  N^K,  the  conditional  distribution  of  the  values 
of  the  unordered  Xj  that  lie  in  (0,t]  is  that  of  independent 
random  variables  with  distribution  function  -^r—  ,  for  0^x<t. 
Thus,  given  Nt=K,  M^t)  has  the  same  distribution  as  a 
constant  plus  the  sum  of  K  independent  identically 
distributed  random  variables.  Thus  the  expectation  of  M^(T) 
can  computed  by  the  property  of  conditional  expectation, 

ElM^t)]  =  E[E[M^(t)  |Nt]  ]  (A.  6) 

Given  Nt=K/ 

E[MfJ(t)|Nt=K]  =^  jo  K1^    -t   -ppt  (A.7) 

Since  the   random  variable   Nt  has   a  binomial   distribution 
with  the  parameter  F(t), 


=  5 


t 
xF(dx)  +  [  1-F(t)]t  (A.8) 


To   check   the   bias   of  the   estimate  M  ^  (t),    using   the 

integration  by  part   of  equation  A.  1,  the   true  estimate  is 
given  by 

M^t)  =  t[l-F(t)]  +  j^tF(dx)  (A.9) 
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Thus  the   estimate  M^(t)    is  unbiased.     Using  conditional 
expectations,  the  variance  is  computed  by 
Var[Md(t)]  =  E[(M,j(t)  -  EtM^t)])2] 

=  E[(M,|(t)  -  E[M,j(t)|Nt]  )2] 

+  2E[(MM(t)  -  ElM^t)  |N±]  )(E[MrJ(t)|Nt]  -  E[MM(t)])] 

+  E[(E[Mw(t)  |Nt]  -E[MM(t)])2]  (A.10) 

Computing  each  of  the  individual  terms,  we  obtain 
E[(Mw(t)  -  E[Mw(t)|Nt]  )2] 

and 

E[(E[Mu(t)|Nt]  -  E[MJt)]  )2] 

=  -1-  F(t)F(t)[  Px  -  —  -  -  t]  2  (A.  12) 

where  F  is  the  survival  distribution  function  of  F.  The 
second  term  of  right  term  of  equation  A.  10  turn  out  to  be 
zero.  Thus  the  variance  of  M^i(t)  is  obtained  by  sum  of  two 
equations.   The  resulting  variance  is 

Var[MM(t)]  =  -K-    {£x2F(dx)  -  [  £  xF(  dx)  ]  2+t2F(  t)F(  t) 

-  2tF(t)[  £xF(dx)]  }  (A.  13) 

Notice  that  the  variance  estimate  of  M^(t)  is  equal  to 
-v-Var(X)  as  t-»oo  .   A  nonparametric  estimate  of  Var[M,j(t)]  is 

Var[MM(t)]  =  --{-■z^-s,2    -  [  --  S.  &..  ]  2  +  t2-„-  [1-  •»■] 
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K   i  K 
-  2t  --  [--Is.  j  (A.  14) 

A 

where  K  is  the  number  of  service  times  that  are  less  than  t. 
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APPENDIX  B 
JACKKNIFE  ESTIMATE  OF  NONPARAMETRIC  ESTIMATE 


It  will  be  assumed  that  the  arrival  rate  \  is  known  and 
equal  to  1.   The  nonparametric  estimate  is  given  by 


A 


A 

M,(t)  =  -'-is,.  +  --^-  t 


(B.l) 


where  S,  ' s  are  the  order  statistics  of  the  service  times  and 
CO 

assumed  that   the  random  variable  K  exists   such  that 


4  S.„£t^  S  4.  .  £  S 


*»*%> 


The  estimate  MA(t)   for  the  data  set 
obtained  by  delecting  the  i    point  from  the  sample  is 
fei(t)  =  f  -1-1  a.  +^J.-A_  t     if  i  >  K 

A, 


I  n-\  j**  °->    n-\ 


if  i  £  K 


(B.2) 


j3^ 


The  pseudo-value  M^(t)  is  computed  by 


M,(t)  =  nM(t)  -  (n-DMJt) 


0A\ 


(B.3) 


where  M   (t)  is  the  estimate  of  MKj(t)  based  on  all  the  data. 


(All 


substituting  the  estimate  M  H(t)  in  equation  B.3,  we  get 


M:(t) 


=(  S 


U) 


if  ii  K 


t     if  i  >  K 
Since   the   jackknife    estimate   is   the   average 
pseudo-values,  the  estimate  is  given  by 


A 

K 


MT(  t)  =  -Is,.+ 


(B.4) 


of   the 


(B.5) 
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Thus,  the  jackknife  estimate  is  the  same  as  the  original 
nonparametric  estimate  of  M^t).  The  jackknife  estimate  of 
variance  is 

Var[M3(t)]  =  -L-[A.J(M.(t)}2  -  l-^^t)!2!      (B.6) 

where  A 

Jr^,(t)}2  =  JC^-  (n-K)t2 
and  A  * 

[^M.(t)}«  =  [4^+  (n-K)t]2 
Thus  the  variance  cai\  be  rewri  ^cen  as      A 

Lf.Lis2   .   rJ-la  I2   +   t2^--   -K- 


Var[  Mj(  t )  ]  =    -  L-  (  -;-  &*&  -    [  -;-±  ^  ■ 


n-K,   I 


-  2*an5l~wlJJ  (B-7) 

Comparing  equation  B. 7  with  equation  A. 14,   it  is  seen  that 


the   jackknife  estimate   of   variance   is  greater   than   the 


estimate  of  equation  A.  14  by  a  multiplicative  constant  ---, 
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APPENDIX  C 
CONDITIONAL  DISTRIBUTION  OF  BOOTSTRAP  ESTIMATE 

It  will   be  assumed   that  the   arrival  rate  \  is  known, 
and  is  equal   to  1.    Let  S,,Sx/..,Sn   denote  random  service 
times  from  a  CDF  F,  and  let  S  ^S  <..iS   <t£   S  ^..<Sn.  denote 

the   corresponding   order   statistics.     The   nonparametric 

r*- 

estimate  of  J0F(s)ds  is 

M»(t)  =  "n-g^>+  -Tf  t  (C.  1) 

Let  B^,  i=l  to  n,  be  independent  random  variables  having  the 
same  distribution  as  draws  with  replacement  from  ( s,  , sx , . .  , s. 
),    and   let  b-  ,i  =  l   to   n,    be  the   corresponding   order 
statistics.    A  bootstrap  realization  of   the  nonparametric 
estimate  is 

v'i  -  i  F>>+  -Vn-II(%>it)lt  (C-2) 

where 

I(x  4  t)    =C    1  if   x^t 


0  otherwise 

To  compute  the  distribution  of  M^(t),   the  Laplace  transform 

-\ 
is  used  [ Ref .  12].   The  Laplace  transform  of  Me(t)  is 

E[exp(-§MB(T))]  =  E[E[exp(-$MB(t))|N*]]  (C.3) 

by  the  property  of  conditional  expectations,  where  N-t  is  the 
number  of  bootstrap  samples  which  are  less  than  or  equal  to 
t.  We  compute  the  right  hand  side  of  equation  C.  3 
separately.   First 

E[exp(-§  MB(t))  |N<_=  1] 
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\ 


=  exp(-$t)[exp(-^-)i«exp(-§-^)]  (C.  4) 

is  computed,    where  the  random  variable  Nt  has   a  binomial 
distribution  with  the   parameter   F(t)=  -^-  .    Thus,    from 

A 

equation  C.  3  the  Laplace  transform  of  M^(t)  is 

E[exp(-  M  (t)]  =  exp(-^t)§[exp(-|4)|r-^exP(-i4'-)J 

=  exp(-^t)[-^exp(|-^)^exp(-i-S^)  +  ^-]n      (C.  5) 

Let   us  define   a   random  variable   Y  having  the   following 
distribution 

Y  =j-if     w.p-^-     if  i<  K  (C.6) 

—    w.  p  *L*    if  i  >  K 
Then  the  Laplace  transform  of  Y  is 

k 
E[exp(-§>Y)]  =  J-Xexpt-,!-^-)  +  *£«*><  -$-$ )        (C7) 

Thus,   M^  (t)   has   the  same   distribution  as   the  sum   of  n 
independent  random  variables  having  the  same  distribution  as 
Y.    For  the  fixed  time  t,  given  the  order  statistics,   Sciy<^i) 
<.  .  <S  <t<S  <.  .  <S     the  expectation  of  Mo(t)  is  written  by 
E[M5(  t)  |  data]  =nE[Y|data] 

=  -is   +  ---  t  (C.  8) 

Thus,   the  bootstrap  estimate   is  asymptotically  an  unbiased 
estimate  of  M,j(t).   The  variance  is 

Var[MB(t)  |  data]  =  nVar[  Y]  (  C.  9) 

The  variance   of  Y   can  be   derived  using   the  equation   C.  6 
Since 
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E[Y2]     =    J-£[5tf2]«    +    !>=&(£)*  (CIO) 

Hence   the    asymptotic   bootstrap  variance   estimate  of  MN(t)    is 
given  by                                           *                       «>                          ^  A 

Var[MB(t)|datai    =   Jl  (  -L  |  ^  -    (  -<-|  %j]  2    ♦    t'-«~  B=5 

-2t-V-lTlw2'  (c-11* 

That  is,   the   asymptotic  bootstrap  estimate  of  variance  is 

the  same  as  the  nonparametric  estimate  (equation  A. 14). 
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